ML Project 3 (Post 3)

April 27, 2014

Last night I got the FNN classifier working on the 16-input, 10-output data file for Project 3 in my Machine Learning class.

Here’s the output!

Read More

ML Project 3 (Post 2)

April 27, 2014

Tonight I learned how to use PyBrain’s feed-forward neural networks for classification. Yay! I had already created a neural network and used it on the project’s regression data set earlier this week, then used those results to “manually” classify (by picking which class each output was closer to, then counting how many points were correctly classified), but tonight I fully implemented PyBrain classification using the 1-of-k method of encoding the classes, and it appears to be working great! The neural network still takes a while to train, but it’s much quicker on this 2-input, 2-class data than it was on the 8-input, 7-output data for part 1 of the project. I’m actually writing this as it trains for the next task (see below). The code I wrote is:

```python
print("\nImporting training data...")
from pybrain.datasets import ClassificationDataSet
#bring in data from training file
traindata = ClassificationDataSet(2,1,2)
f = open("classification.tra")
for line in f.readlines():
    #using classification data set this time (subtracting 1 so first class is 0)
    traindata.appendLinked(list(map(float, line.split()))[0:2], int(list(map(float, line.split()))[2])-1)

print("Training rows: %d " % len(traindata))
print("Input dimensions: %d, output dimensions: %d" % (traindata.indim, traindata.outdim))

#convert to have 1 in column per class
traindata._convertToOneOfMany()
#raw_input("Press Enter to view training data...")
#print(traindata)
print("\nFirst sample: ", traindata['input'][0], traindata['target'][0], traindata['class'][0])

print("\nCreating Neural Network:")
#create the network
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure.modules import SoftmaxLayer
#change the number below for neurons in hidden layer
hiddenneurons = 2
net = buildNetwork(traindata.indim, hiddenneurons, traindata.outdim, outclass=SoftmaxLayer)
print('Network Structure:')
print('\nInput: ', net['in'])
#can't figure out how to get hidden neuron count, so making it a variable to print
print('Hidden layer 1: ', net['hidden0'], ", Neurons: ", hiddenneurons)
print('Output: ', net['out'])

#raw_input("Press Enter to train network...")
#train neural network
print("\nTraining the neural network...")
from pybrain.supervised.trainers import BackpropTrainer
trainer = BackpropTrainer(net, traindata)
trainer.trainUntilConvergence(dataset=traindata, maxEpochs=100, continueEpochs=10, verbose=True, validationProportion=.20)

print("\n")
for mod in net.modules:
    for conn in net.connections[mod]:
        print conn
        for cc in range(len(conn.params)):
            print conn.whichBuffers(cc), conn.params[cc]

print("\nTraining Epochs: %d" % trainer.totalepochs)

from pybrain.utilities import percentError
trnresult = percentError( trainer.testOnClassData(dataset=traindata), traindata['class'] )
print("  train error: %5.2f%%" % trnresult)
#result for each class
trn0, trn1 = traindata.splitByClass(0)
trn0result = percentError( trainer.testOnClassData(dataset=trn0), trn0['class'])
trn1result = percentError( trainer.testOnClassData(dataset=trn1), trn1['class'])
print("  train class 0 samples: %d, error: %5.2f%%" % (len(trn0), trn0result))
print("  train class 1 samples: %d, error: %5.2f%%" % (len(trn1), trn1result))

raw_input("\nPress Enter to start testing...")

print("\nImporting testing data...")
#bring in data from testing file
testdata = ClassificationDataSet(2,1,2)
f = open("classification.tst")
for line in f.readlines():
    #using classification data set this time (subtracting 1 so first class is 0)
    testdata.appendLinked(list(map(float, line.split()))[0:2], int(list(map(float, line.split()))[2])-1)

print("Test rows: %d " % len(testdata))
print("Input dimensions: %d, output dimensions: %d" % (testdata.indim, testdata.outdim))

#convert to have 1 in column per class
testdata._convertToOneOfMany()
print("\nFirst sample: ", testdata['input'][0], testdata['target'][0], testdata['class'][0])

print("\nTesting...")
tstresult = percentError( trainer.testOnClassData(dataset=testdata), testdata['class'] )
print("  test error: %5.2f%%" % tstresult)
#result for each class
tst0, tst1 = testdata.splitByClass(0)
tst0result = percentError( trainer.testOnClassData(dataset=tst0), tst0['class'])
tst1result = percentError( trainer.testOnClassData(dataset=tst1), tst1['class'])
print("  test class 0 samples: %d, error: %5.2f%%" % (len(tst0), tst0result))
print("  test class 1 samples: %d, error: %5.2f%%" % (len(tst1), tst1result))
```
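In case the 1-of-k (one-hot) encoding isn’t familiar: `_convertToOneOfMany()` turns each integer class label into a vector with a 1 in that class’s column and 0s elsewhere. A minimal sketch of the idea in plain Python (not PyBrain’s code, just an illustration with a helper name I made up):

```python
def one_of_k(label, num_classes):
    """Encode an integer class label as a 1-of-k (one-hot) vector."""
    vec = [0] * num_classes
    vec[label] = 1
    return vec

# Class 1 of 2 becomes [0, 1], class 0 becomes [1, 0] -- this is the shape
# the ClassificationDataSet targets take after _convertToOneOfMany().
print(one_of_k(1, 2))  # [0, 1]
print(one_of_k(0, 2))  # [1, 0]
```

This is also why the output layer has one neuron per class and uses a softmax: the network’s output vector can then be compared column-by-column against these targets.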

Read More

Machine Learning Project 3

April 24, 2014

I’m in the midst of working on Project 3 for my Machine Learning class. This one has the following tasks:

1. Train a 3-layer (input, hidden, output) neural network with one hidden layer on the given training set, which has 8 inputs and 7 outputs. Obtain training and testing errors with the number of hidden units set at 1, 4, and 8.
2. Design a neural network for classification and train it on the given training set with 2 inputs and 2 classes. Apply the trained network to the testing data. Let the number of hidden units be 1, 2, and 4, respectively, and obtain training and testing classification accuracies for each.
3. Repeat task 2 on the training data set with 16 inputs and 10 classes, using 5, 10, and 13 hidden units.
4. Repeat tasks 2 and 3 using an SVM classifier. Choose several kernel functions and parameters and report the training and testing accuracies for each.

Thank goodness we’re allowed to use built-in functions this time! The prof recommended MATLAB, but said I could use Python if I could find a good library for neural networks, so I decided to try PyBrain. I had a hard time installing PyBrain because I was using Python 3.3. Once I realized it was incompatible, and that I didn’t want to attempt the modifications needed to make it work within a 1-week project turnaround, I went looking for another package that could do neural networks. I tried neurolab and just couldn’t get it to work, and everywhere I read about problems online, people suggested using PyBrain instead. I already had Python 2.7 installed, so I configured my computer to install PyBrain for 2.7, run Python 2.7, and use it in Visual Studio (my current IDE), and finally got it up and running. As of last night, I had some preliminary solutions for task 1, but I don’t fully trust the results, so I’m playing around with it a bit tonight.

I do have a little more time to experiment since the due date got moved from Friday night to Monday (once I pointed out that handing out a project on the Saturday of Easter weekend – when I was actually working on a major project for my other grad course, Risk Analysis – and having it due the following Friday wasn’t very workable for those of us with full-time jobs, and that extending it to include even one weekend day would help). So, that’s underway, and I’m actually writing this blog post while I wait for my latest neural network setup to train to 100 epochs in PyBrain! I’ll update when I have some results to...
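I haven’t started the SVM task yet, but as a refresher on what “choose several kernel functions” means, the two most common kernels can be written directly. This is just an illustrative sketch (the function names and the `gamma` value are my own choices, not anything from the project handout):

```python
import math

def linear_kernel(x, y):
    """Linear kernel: plain dot product of the two input vectors."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

print(linear_kernel([1, 2], [3, 4]))  # 11
print(rbf_kernel([1, 2], [1, 2]))     # 1.0 (identical points)
```

An SVM library would evaluate one of these between pairs of training points; the kernel choice and its parameters (like `gamma`) are exactly what task 4 asks us to vary and report on.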

Read More

ML Project 2 (Post 2)

April 10, 2014

I finished the Linear Regression & Classification project for Machine Learning class tonight. The part that took me the longest was definitely the conversion of the C sample code to Python! I can read C OK, but I’ve never written more than the most basic code in C, and I want to learn Python as well as possible during this class anyway. (If you haven’t read my past posts, I just started learning Python on my own this semester.) The project gave us the option of using any programming language, so I set out on the task of converting it, and many frustrations with loops and matrices later, I got the code below. The output doesn’t match the one from the C code exactly (it’s off by around .1, and not consistently the same each time I run it), so something is likely incorrect, but it runs and gives an almost-correct result, so I had to move on in order to finish on time. I’m not sure whether the modifications I made for the rest of the tasks in the project are correct yet (I’ll update when I get it back), but I’ve attached my code files below. You can see in the pasted code that I was outputting at every step of the way to debug (and actually, I removed most of my many print statements to clean it up!). Now that it’s turned in, let me know if you have any recommendations for improving the code!
Linear Regression converted from C Sample Code:

```python
import numpy as np

#bring in data from training file
i = 0
x = []  #inputs x
ty = [] #outputs ty
f = open("regression.tra")
for line in f.readlines():
    #every other line in file is 8 input values or 7 output values
    if i%2 == 0:
        x.append(list(map(float, line.split())))
    else:
        ty.append(list(map(float, line.split())))
    i = i+1

print("TRAINING DATA")
print("Length of training set: %d , %d " % (len(x), len(ty)))
#print(i)
#input("Press Enter to view input data...")
#print('x:')
#print(x)
#input("Press Enter to view output data...")
#print('ty:')
#print(ty)

#x-augmented, copy x add a column of all ones
xa = np.append(x, np.ones([len(x),1]), 1)
print("Shape xa: " + str(np.shape(xa)))
print("Shape ty: " + str(np.shape(ty)))

Nin = 9
Nout = 7

#bring in data from TEST file
i2 = 0
x2 = []      #inputs x
ty_test = [] #outputs ty
f2 = open("regression.tst")
for line in f2.readlines():
    #every other line in file is 8 input values, 7 output values
    if i2%2 == 0:
        x2.append(list(map(float, line.split())))
    else:
        ty_test.append(list(map(float, line.split())))
    i2 = i2+1

print("\nTEST DATA")
print("Length of test set: %d , %d " % (len(x2), len(ty_test)))

#x-augmented, copy x add a column of all ones
xa_test = np.append(x2, np.ones([len(x2),1]), 1)
print("Shape xa_test: " + str(np.shape(xa_test)))
print("Shape ty_test: " + str(np.shape(ty_test)))

input("\nPress Enter to continue...")

print("Calculating auto-correlation...")
#auto-correlation xTx
R = [[0.0 for j in range(Nin)] for i in range(Nin)]
for xarow in xa:
    for i in range(Nin):
        for j in range(Nin):
            R[i][j] = R[i][j] + (xarow[i] * xarow[j])

print("Calculating cross-correlation...")
#cross-correlation xTty
C = [[0.0 for j in range(Nin)] for i in range(Nout)]
for n in range(len(xa)):
    for i in range(Nout):
        for j in range(Nin):
            C[i][j] = C[i][j] + (ty[n][i] * xa[n][j])

#print("Shape R: " + str(np.shape(R)) + " Shape C: " + str(np.shape(C)))
print("Normalizing correlations...")
#normalize (1/Nv)
for i in range(Nin):
    for j in range(Nin):
        R[i][j] = R[i][j]/(len(xa))
for i in range(Nout):
    for j in range(Nin):
        C[i][j] = C[i][j]/(len(ty))

meanseed = 0.0
stddevseed = 0.5

##set up W
w0 = [[0.0 for j in range(Nin-1)] for i in range(Nout)]
W = [[0.0 for j in range(Nin)] for i in range(Nout)]
for i in range(Nout):
    for j in range(Nin-1):
        #assign random weight for initial value
        w0[i][j] = np.random.normal(meanseed, stddevseed)
        W[i][j] = w0[i][j]
    W[i][Nin-1] = np.random.normal(meanseed, stddevseed)

#conjugate gradient subroutine (this could be called as a function)
#input("Press enter to calculate weights...")
print("Calculating weights...")
for i in range(Nout): #loop around CGrad in sample
    passiter = 0
    XD = 1.0
    #copying matrix parts needed
    w = W[i]
    r = R
    c = C[i]
    Nu = Nin
    while passiter < 2: #2 passes
        p = [0.0 for j in range(Nu)]
        g = [0.0 for j in range(Nu)]
        for j in range(Nu): #equivalent to "iter" loop in sample code (again, check loop values)
            for k in range(Nu): #equiv to l loop in sample
                tempg = 0.0
                for m in range(Nu):
                    tempg = tempg + w[m]*r[m][k]
                g[k] = -2.0*c[k] + 2.0*tempg
            XN = 0.0
            for k in range(Nu):
                XN = XN + g[k] * g[k]
            B1 = XN / XD
            XD = XN
            for k in range(Nu):
                p[k] = -g[k] + B1*p[k]
            Den =...
```
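Since the post asks for recommendations: one improvement I’d suggest is replacing the triple loops for the correlation matrices with NumPy matrix products. This is a sketch of my own (the small random arrays just stand in for `xa` and `ty`, since the data files aren’t included here), and it should match the loop versions of `R` and `C` up to floating point:

```python
import numpy as np

# small fake data standing in for xa (N x Nin) and ty (N x Nout);
# the real regression.tra values aren't available here
rng = np.random.RandomState(0)
xa = rng.rand(5, 3)  # augmented inputs (last column would be the ones)
ty = rng.rand(5, 2)  # target outputs

N = len(xa)
R = xa.T.dot(xa) / N   # auto-correlation, same result as the triple loop
C = ty.T.dot(xa) / N   # cross-correlation, same result as the triple loop

# the normal-equations weights then come straight from a linear solver,
# with no conjugate-gradient loop needed: solve R * W^T = C^T
W = np.linalg.solve(R, C.T).T
print(R.shape, C.shape, W.shape)
```

The conjugate-gradient routine in the C sample is doing this same solve iteratively, so comparing its weights against `np.linalg.solve` would also be a handy way to debug the ~0.1 discrepancy.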

Read More

ML Project 2

March 27, 2014

For the latest project, my Machine Learning professor gave us some sample code (in C) and we have to:

1. Convert the sample into the language we’ll be using (Python in my case) and compile & run the linear regression model on the training data, calculating the error using a function.
2. Modify the program to…
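The post doesn’t show the error function yet, but a minimal sketch of what “calculating the error using a function” could look like, assuming mean squared error over the output vectors (the metric and the function name are my assumptions, not from the assignment):

```python
def mse(predictions, targets):
    """Mean squared error: sum squared differences over all outputs,
    averaged over the number of samples."""
    total = 0.0
    for p_row, t_row in zip(predictions, targets):
        total += sum((p - t) ** 2 for p, t in zip(p_row, t_row))
    return total / len(predictions)

print(mse([[1.0, 2.0]], [[0.0, 2.0]]))  # 1.0
```

Called on the model’s outputs versus the training targets, this gives a single number to compare against the C sample’s reported error.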

Read More

Midterms & Project 1 Grade

March 27, 2014

I’ve been gone from the blog for a while because of midterms in my two grad classes (Risk Analysis and Machine Learning), and I was about to come back and write about an algorithm I explored that wasn’t related to one of my classes, but my Machine Learning professor went and assigned another project…

Read More