python - Selecting multiple columns in a pandas dataframe
I have data in different columns, but I don't know how to extract it and save it in another variable.
index  a  b  c
1      2  3  4
2      3  4  5
How do I select columns 'a' and 'b' and save them to df1?
I tried:
df1 = df['a':'b']
df1 = df.ix[:, 'a':'b']
Neither seems to work.
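The column names (which are strings) cannot be sliced in the manner you tried. A slice inside plain [] selects rows, not columns, and .ix has since been removed from pandas (it was dropped in version 1.0).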
Here you have a couple of options. If you know from context which columns you want, you can select just those columns by passing a list of names into the __getitem__ syntax (the []'s). This returns a new DataFrame containing only those columns.
df1 = df[['a', 'b']]
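As a minimal sketch of the name-based selection, assuming a small frame shaped like the one in the question (the data values and the names df2 and the row index are illustrative only). It also shows .loc, which does accept a label slice and includes both endpoints:

```python
import pandas as pd

# Hypothetical frame mirroring the question's example data.
df = pd.DataFrame({"a": [2, 3], "b": [3, 4], "c": [4, 5]}, index=[1, 2])

# A list of labels inside [] selects those columns, in the order listed.
df1 = df[["a", "b"]]

# .loc accepts a label slice for columns; both 'a' and 'b' are included.
df2 = df.loc[:, "a":"b"]

print(df1)
```

Both df1 and df2 hold columns 'a' and 'b' here, because 'b' directly follows 'a' in the column order. With a list you can also pick non-adjacent columns, which a slice cannot do.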
Alternatively, if you need to select columns by position rather than by name (say your code should do this automatically without knowing the names of the first two columns), then you can do this instead:
df1 = df.iloc[:, 0:2]  # Remember that Python slices exclude the ending index.
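A short sketch of positional selection, again assuming a hypothetical three-column frame like the one in the question. The names first_two and picked are made up for illustration:

```python
import pandas as pd

# Hypothetical frame mirroring the question's example data.
df = pd.DataFrame({"a": [2, 3], "b": [3, 4], "c": [4, 5]}, index=[1, 2])

# Positions 0 and 1 are selected; position 2 (the slice end) is excluded.
first_two = df.iloc[:, 0:2]

# A list of positions picks arbitrary, possibly non-adjacent columns.
picked = df.iloc[:, [0, 2]]

print(list(first_two.columns))
print(list(picked.columns))
```

The leading colon means "all rows"; only the second position of iloc restricts the columns.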
Additionally, you should familiarize yourself with the idea of a view into a pandas object vs. a copy of that object. The first of the above methods (selecting with a list of names) returns a new copy in memory of the desired sub-object.
Sometimes, however, there are indexing conventions in pandas that don't do this and instead give you a new variable that just refers to the same chunk of memory as the sub-object or slice in the original object. When this happens, changing what you think is the sliced object can alter the original object. This can happen with the second way of indexing (slicing with iloc), so you can append the copy() method to get a regular copy. It is always good to be on the lookout for this.
df1 = df.iloc[:, 0:2].copy()  # To avoid the case where changing df1 also changes df
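The effect of the explicit copy can be checked with a small sketch like the one below, which assumes the same hypothetical frame as in the question. Whether a plain iloc slice shares memory with the original depends on the pandas version and settings, so this example only demonstrates the guarantee that copy() gives:

```python
import pandas as pd

# Hypothetical frame mirroring the question's example data.
df = pd.DataFrame({"a": [2, 3], "b": [3, 4], "c": [4, 5]}, index=[1, 2])

# Take the first two columns as an independent copy.
df1 = df.iloc[:, 0:2].copy()

# Modify the copy: set column 'a' in the row labelled 1.
df1.loc[1, "a"] = 100

# The original frame keeps its old value.
print(df.loc[1, "a"])
print(df1.loc[1, "a"])
```

After the assignment, df1 holds 100 in that cell while df still holds 2.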
To use iloc, you need to know the column positions (or indices). As the column positions may change, instead of hard-coding indices you can use iloc together with the get_loc method of the DataFrame's columns attribute to look up column positions by name.
{c: df.columns.get_loc(c) for c in df.columns}
Now you can use this dictionary to translate column names into positions and pass those positions to iloc.
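One way to put the get_loc lookup to use is sketched below. The frame, the positions dictionary, and the wanted list are hypothetical names chosen for illustration, and the sketch assumes the column names are unique (get_loc returns a single integer only in that case):

```python
import pandas as pd

# Hypothetical frame mirroring the question's example data.
df = pd.DataFrame({"a": [2, 3], "b": [3, 4], "c": [4, 5]}, index=[1, 2])

# Map each column name to its current integer position.
positions = {name: df.columns.get_loc(name) for name in df.columns}

# Translate the wanted names into positions, then select with iloc.
wanted = ["a", "b"]
df1 = df.iloc[:, [positions[name] for name in wanted]]

# Index.get_indexer does the same name-to-position lookup in one call.
df2 = df.iloc[:, df.columns.get_indexer(wanted)]

print(positions)
```

If the columns are later reordered, rebuilding positions keeps the selection tied to the names rather than to fixed numbers.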