2016-05-30 6 views
-3

私はPythonの初心者です。このデータについては、私はJupytier iPythonで作業しています。私はcsvファイルからSklearnを実行するために数値データを抽出しようとしています。私が持っている:Pythonタプルからデータを抽出するにはどうすればよいですか?

が開き、CSVは、パンダでファイルの読み取り

私は私のデータをオーガンジーするように設定辞書にデータを設定し

(私のコードと値を見るためにパンダのデータフレームを設定)

は、機械学習のために、さらに、アクセシビリティのための2次元配列に自分のデータを再構成(reshaped)機械学習

ため、それにアクセスできるようにNP配列に自分のデータを設定し

データをターゲット属性に設定しようとしましたが、パンダを使用したLat_readerがタプルであることがわかりました。私は任意のタイプのMLを実行する前にタプルからデータを抽出する必要があります。基本的に私のLat-readerはタプルになり、Sklearnsはタプルを扱うことができません。タプルの内部にはデータがありますが、それをタプルから変更したいのです。ここで

は私のコードです:ここでは

import numpy as np 
import pandas as pd 


codes = {'Generation':{1:'First', 2: 'Second'}, 
    'Location':{1:'New York', 2:'Pennsylvania', 3: 'New Jersey'}, 
    'Age at Last Birthday':{}, 
    'Gender':{1:'Male', 2:'Female'}, 
    'Country':{1: 'Arg', 2:'Bol', 3:'Bra', 4:'Col', 5:'DR', 6:'Edu', 7:'El  Sal', 8:'Gua', 9:'Hon', 10:'Mex', 
       11:'Nic', 12:'Pan', 13:'Peru', 14:'PR', 15:'Ven'}, 
    'African Roots':{0:'No', 1:'Yes'}, 
    'European Roots':{0:'No', 1:'Yes'}, 
    'Indian Roots':{0:'No', 1:'Yes'}, 
    'Other Roots':{0:'No', 1:'Yes'}, 
    'Skin Color':{1:'Light',2:'Medium Light',3:'Medium',4:'Mediium Dark',5:'Dark'}, 
    'Legal Status':{1:'Documents',2:'No Documents',3:'Questionable Documents',9:'Missing'}, 
    'Reason for Migration':{1:'supply-side economics',2:'demand-side economics',3:'network links', 
          4:'violence at origin', 5:'family reasons',6:'other'}, 
    'Return Plans':{1:'Yes', 2:'No', 3:'Dont Know', 9:'Not Asked'}, 
    'Occupation':{1:'Unpaid',2:'Student',3:'Agrigulture', 4:'Unskilled Operative', 
        5:'Skilled Operative', 6:'Transport Worker',7:'Unsilled Services', 
        8:'Skilled Services',9:'Small Business',10:'Professional',11:'Retired',99:'Unknown'}, 
    'Wage in US Dollars':{88:'Not Appliable',99:'Unknown'}, 
    'Hours Worked':{ 88:'Not Appliable', 99:'Unknown'}, 
    'Identity':{1:'Latino', 2:'American', 3:'Both',99:'Unknown'}, 
    'Latino Identity Among Immigrants':{1:'Yes', 2:'No', 3:'Yes-No', 4: 'Dont Know', 9:'Missing'}, 
    'Reasons for Latino Identity':{1:'Yes',2:'No', 9:'Unknown'}, 
    'With Whom Gets Together':{1:'Yes', 2:'No', 9:'Unknown'}, 
    'USYrs':{88:'Not Appliable',99:'Mising'}, 
    'In Contact With Home':{1:'Yes',2:'No',9:'Unknown'}, 
    'R Send Money Home':{1:'Yes',2:'No',3:'Send Other', 9:'Unknown'}, 
    'Parent Send Money':{1:'Yes',2:'No',3:'Not Appliable',4:'Unknown'}, 
    'Quantity Sent by Respondent or Parent':{1:'Half of Paycheck',2:'20% of paycheck',3:'Varies month to month'}, 
    'How Money Sent':{1:'Moneygram',2:'Paisano',3:'Friend',4:'Self',5:'Bank', 6:'Moneygram and Paisano', 
         7:'Moneygram and Friend'}, 
    'Frequency Money Sent':{1:'Once a month', 2:'Twice a year',3:'Once a year',4:'Once in a while',5:'Holidays'}, 
    'How Money Used':{0:'No Use',1:'Buy House',2:'Family Expenses',3:'Health',4:'Education',5:'Savings',6:'Pay a Debt'}, 
    'Bank in US':{1:'Yes',2:'No',3:'Unknown'}, 
    'Bank Overseas':{1:'Yes',2:'No',3:'Unknown'}, 
    'Type of Communication':{1:'Land Phone',2:'Cell Phone',3:'Calling Card',4:'Email',5:'Regular Mail', 
           6:'No Communication', 9:'Unknown'}, 
    'Presents Sent':{1:'Yes',2:'No', 3:'Unknown'}, 
    'Education':{}, 
    'EngAbli':{0:'None',1:'Some English',2:'Good English',9:'Missing'}, 
    'EconOpp':{1: 'More in the US',2:'More at Origin',3:'Same at Both', 9:'Missing'}, 
    'OthOpps':{0:'Just Earnings',1:'Personal',2:'Work',3:'Study'}, 
    'Inequality':{1:'More at Origin',2:'More in the US',3:'Same in Both', 9:'Missing'}, 
    'Discrim':{1:'Yes',0:'No',9:'Missing'}, 
    'Context':{1:'Work/School',2:'On The Street',3:'Language',4:'Race/Ethnicity',5:'Medical',6:'Violence', 
       7:'Poverty',8:'Other',9:'Missing'}} 

pd.DataFrame(codes.items(), columns =['Codes', 'Values']) 



Lat_pro = open('Identity.Codes.Datafile.csv') 


Lat_reader = (pd.read_csv(Lat_pro), ',') 

np.array(Lat_reader) 


newLat_reader = np.reshape(A, (202,73)) 


print newLat_reader 

は、出力のサンプルです:

         Unnamed: 0 Unnamed: 1 \ 
0          Subject Code  Gen 
1            F-001   1 
2            F-002   1 
3            F-003   1 
4            F-007   1 
5            F-008   1 
6            F-010   1 
7            F-013   1 
8            F-014   1 
9            F-015   1 
10            F-016   1 
11            F-017   1 
12            F-018   1 
13            F-019   1 
14            F-020   1 
15            F-021   1 
16            F-022   1 
17            F-024   1 
18            F-025   1 
19            F-026   1 
20            F-027   1 
21            F-028   1 
22            F-032   1 
23            F-033   1 
24            F-034   1 
25            F-035   1 
26            F-036   1 
27            F-037   1 
28            F-038   1 
29            F-039   1 
..            ...  ... 
172          Legal Status  NaN 
173        Reason for Migration  NaN 
174          Return Plans  NaN 
175          Occupation  NaN 
176            NaN  NaN 
177            Wage  NaN 
178          Hours Worked  NaN 
179           Identity  NaN 
180     Latino Identity Among Immigrants  NaN 
181      Reasons for Latino Identity  NaN 
182       With Whom Gets Together  NaN 
183            USYrs  NaN 
184     In Contact with Home Community  NaN 
185        R Sends Money Home  NaN 
186 Parent Sends Money Home (Second Generation Only)  NaN 
187    Quantity Sent by Respondent or Parent  NaN 
188         How Money Sent  NaN 
189        Frequency Money Sent  NaN 
190         How Money Used  NaN 
191          Bank in US  NaN 
192          Bank Overseas  NaN 
193        Type of Communication  NaN 
194         Presents Sent   NaN 
195           Education  NaN 
196           EngAbil  NaN 
197           EconOpps  NaN 
198           OthOpps  NaN 
199          Inequality  NaN 
200           Discrim  NaN 
201           Context  NaN 

             Unnamed: 2 Unnamed: 3 Unnamed: 4 \ 
0            Place  Age  Male 
1             1   28   1 
2             2   35   1 
3             1   30   0 
4             3   19   1 
5             3   20   1 
6             2   21   0 
7             3   29   1 
8             1   25   1 
9             3   23   1 
10             3   30   0 
11             3   21   0 
12             3   23   1 
13             3   34   1 
14             3   33   1 
15             3   33   0 
16             3   33   1 
17             3   26   1 
18             2   31   1 
19             3   31   0 
20             3   20   1 
21             1   20   0 
22             3   22   1 
23             1   20   1 
24             3   30   0 
25             3   22   1 
26             3   26   0 
27             3   25   1 
28             1   19   0 
29             3   21   1 
..             ...  ...  ... 
172 1=Documents 2=No Documents 3=Questionable ...  NaN  NaN 
173 1=supply-side economics 2=demand-side econom...  NaN  NaN 
174 1=Yes 2=No 3=Don't Know 4=No Answer 9=...  NaN  NaN 
175 1=Unpaid 2=Student 3=Agrigulture 4=Unskille...  NaN  NaN 
176 7=Unsilled Services 8=Skilled Services 9=Sma...  NaN  NaN 
177 Wage in U.S. Dollars; 88=Not applicable; 99=...  NaN  NaN 
178  Hours Worked; 88=Not Applicable; 99=Unknown  NaN  NaN 
179   1=Latino 2=American 3=Both 9=Unknown  NaN  NaN 
180  1=Yes 2=No 3=Yes-No 4=Don't Know 9=Missing  NaN  NaN 
181       1=Yes 0=No 9=Unknown  NaN  NaN 
182       1=Yes 0=No 9=Unknown  NaN  NaN 
183 Number of Years in US; 88=Not Applicable; 99 M...  NaN  NaN 
184       1=Yes 0=No 9=Unknown   NaN  NaN 
185    1=Yes 2=No 3=Send Other 9=Unknown  NaN  NaN 
186   1=Yes 2=No 8=Not Applicable 9=Unknown  NaN  NaN 
187 1=Half of Paycheck 2=20% of Paycheck 3=Varie...  NaN  NaN 
188 1=Moneygram 2=Paisano 3=Friend 4=Self ...  NaN  NaN 
189 1=Once a Month 2=Twice a Year 3=Once a Yea...  NaN  NaN 
190 0=No Use 1=Buy House 2=Family Expenses 3=...  NaN  NaN 
191      1=Yes 2=No 9=Unknown   NaN  NaN 
192       1=Yes 2=No 9=Unknown  NaN  NaN 
193 1=Land Phone 2=Cell Phone 3=Calling Card ...  NaN  NaN 
194       1=Yes 2=No 9=Unknown  NaN  NaN 
195           In Years  NaN  NaN 
196 0=None 1=Some English 2=Good English 9=Mi...  NaN  NaN 
197 1=More in US 2=More at Origin 3=Same at Bot...  NaN  NaN 
198 0=Just Earnings 1=Personal 2=Work 3=Study ...  NaN  NaN 
199 1=More at Origin 2=More in US 3=Same in Bot...  NaN  NaN 
200       1=Yes 0=No 9=Missing  NaN  NaN 
201 1=Work/School 2=On Street 3=Language 4=Rac...  NaN  NaN 
+0

タプルからデータを抽出することはどういう意味ですか?おそらく、[いくつかの質問の提言](https://stackoverflow.com/help/how-to-ask)と[良いパンダの例](https://stackoverflow.com/questions/20109391)を見てみたいと思います。/how-to-make-good-reproducible-pandas-examples) – Luis

+0

タプルがxで、タプルの新しいリストがyの場合、...... 'y = list(x)'それです –

答えて

1

あなたは、基本的なPythonの構文を意味する場合は、あなたが最初にそこに行く必要があります。ここで

https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences

いくつかの例は以下のとおりです。

data = ('first', 'second', 'third', 'fourth')  
print "count is ", len(data) 
print "accessing by array notation: ", data[0] 
print "for loop:" 
for x in data: 
    print x 

リストにタプルを変換するには(あなたはタプル以外の何かをしたいから、私はリストが動作するかもしれない考え出し)

print [x for x in data] 
関連する問題