How to put back a number-based index

alex wright wrightalexw at gmail.com
Sat May 14 10:43:52 EDT 2016


I have recently been going through "Data Science From Scratch", which may be
interesting.  There is a podcast episode with the author on Talk Python To Me.

https://talkpython.fm/episodes/show/56/data-science-from-scratch

On Sat, May 14, 2016 at 10:33 AM, Michael Selik <michael.selik at gmail.com>
wrote:

> You might also be interested in "Python for Data Analysis" for a thorough
> discussion of Pandas.
> http://shop.oreilly.com/product/0636920023784.do
>
> On Sat, May 14, 2016 at 10:29 AM Michael Selik <michael.selik at gmail.com>
> wrote:
>
> > David, it sounds like you'll need a thorough introduction to the basics of
> > Python.
> > Check out the tutorial: https://docs.python.org/3/tutorial/
> >
> > On Sat, May 14, 2016 at 6:19 AM David Shi <davidgshi at yahoo.co.uk> wrote:
> >
> >> Hello, Michael,
> >>
> >> I discovered that the problem is "two columns of data are put together"
> >> and "are recognised as one column".
> >>
> >> This is very strange.  I would like to understand the subject well.
> >>
> >> And, how many ways are there to investigate into the nature of objects
> >> dynamically?
> >>
> >> Some object types only get shown as an object.  Is there anything that can
> >> be typed in Python to reveal objects?
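On the question of inspecting objects dynamically, here is a minimal sketch using only built-ins. The dictionary is a made-up stand-in; the same calls work on any object, including a DataFrame.

```python
# A stand-in object; any Python object can be inspected the same way.
value = {"sid": [1, 4, 5]}

kind = type(value)                    # the object's class
is_mapping = isinstance(value, dict)  # check against an expected type
# dir() lists attribute names; filter out the underscore-prefixed ones.
public_names = [n for n in dir(value) if not n.startswith("_")]
text = repr(value)                    # the developer-oriented string form

print(kind)
print(public_names)
print(text)
```

In an interactive session, help(value) prints the documentation for the object's type. For a pandas DataFrame, df.dtypes, df.index, df.columns and df.info() show how the data is actually laid out.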
> >>
> >> Regards.
> >>
> >> David
> >>
> >>
> >> On Saturday, 14 May 2016, 4:30, Michael Selik <michael.selik at gmail.com>
> >> wrote:
> >>
> >>
> >> What were you hoping to get from ``df[0]``?
> >> When you say it "yields nothing" do you mean it raised an error? What was
> >> the error message?
> >>
> >> Have you tried a Google search for "pandas set index"?
> >>
> >>
> >> http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.set_index.html
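A minimal sketch of set_index, assuming the frame has an ordinary column named sid. The frame below is invented for illustration and borrows a few of the values quoted in this thread.

```python
import pandas as pd

# Hypothetical data: 'sid' starts out as a regular column.
df = pd.DataFrame({"sid": [1, 4, 5],
                   "value": [133738.70, 295256.11, 137733.09]})

indexed = df.set_index("sid")     # 'sid' becomes the row index
row = indexed.loc[4]              # look up a row by its sid label
restored = indexed.reset_index()  # 'sid' is a column again, index is 0..n-1

print(indexed)
print(restored)
```

reset_index is the inverse step: it moves the index back into a column and puts the default number-based index in its place.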
> >>
> >> On Fri, May 13, 2016 at 11:18 PM David Shi <davidgshi at yahoo.co.uk>
> >> wrote:
> >>
> >> Hello, Michael,
> >>
> >> I tried to discover the problem.
> >>
> >> df[0]   yields nothing
> >> df[1]  yields nothing
> >> df[2] yields nothing
> >>
> >> However, df[3] gives the following:
> >>
> >> sid
> >> -9223372036854775808          NaN
> >>  1                      133738.70
> >>  4                      295256.11
> >>  5                      137733.09
> >>  6                      409413.58
> >>  8                      269600.97
> >>  9                       12852.94
> >>
> >>
> >> Can we split this back to normal, or turn it into a dictionary, so
> >> that I can put values back properly?
> >>
> >>
> >> I would like to use sid as the index, some way.
> >>
> >>
> >> Regards.
> >>
> >>
> >> David
> >>
> >>
> >>
> >> On Friday, 13 May 2016, 22:58, Michael Selik <michael.selik at gmail.com>
> >> wrote:
> >>
> >>
> >> What code have you tried? What error message are you receiving?
> >>
> >> On Fri, May 13, 2016, 5:54 PM David Shi <davidgshi at yahoo.co.uk> wrote:
> >>
> >> Hello, Michael,
> >>
> >> How to convert a float type column into an integer or label or string
> >> type?
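A minimal sketch of converting a float column with astype, using an invented Series that contains one missing value. Missing values need handling first, because NaN has no integer representation. The -9223372036854775808 label quoted earlier on this page is the smallest 64-bit integer, so it is likely (an assumption, not confirmed in the thread) that a NaN was forced into an integer somewhere.

```python
import pandas as pd

# Hypothetical float column with a missing value.
s = pd.Series([1.0, 4.0, None, 5.0], name="sid")

clean = s.dropna()           # drop the NaN before casting
as_int = clean.astype(int)   # 1.0 -> 1
as_str = as_int.astype(str)  # 1 -> '1', usable as a string label

print(as_int.tolist())
print(as_str.tolist())
```

Dropping the rows is only one choice; s.fillna(some_value) before the cast is another, depending on what a missing sid should mean.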
> >>
> >>
> >> On Friday, 13 May 2016, 22:02, Michael Selik <michael.selik at gmail.com>
> >> wrote:
> >>
> >>
> >> To clarify that you're specifying the index as a label, use df.loc; to
> >> select by integer position instead, use df.iloc:
> >>
> >>     >>> df = pd.DataFrame({'X': range(4)}, index=list('abcd'))
> >>     >>> df
> >>        X
> >>     a  0
> >>     b  1
> >>     c  2
> >>     d  3
> >>     >>> df.loc['a']
> >>     X    0
> >>     Name: a, dtype: int64
> >>     >>> df.iloc[0]
> >>     X    0
> >>     Name: a, dtype: int64
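In the example above the label 'a' and the position 0 happen to pick the same row. A small sketch with float labels, loosely modelled on the StateFIPS index in this thread (the data is invented), shows where the two differ:

```python
import pandas as pd

# Hypothetical frame whose index labels are floats.
df = pd.DataFrame({"X": [10, 20, 30]}, index=[1.0, 4.0, 5.0])

by_label = df.loc[4.0]    # the row whose index label is 4.0
by_position = df.iloc[0]  # the first row, whatever its label is

# Turn the float labels into integers once they are known to be whole numbers.
df.index = df.index.astype(int)

print(by_label["X"], by_position["X"])
print(df.index.tolist())
```

After the conversion, df.loc[4] refers to the same row that df.loc[4.0] did before.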
> >>
> >> On Fri, May 13, 2016 at 4:54 PM David Shi <davidgshi at yahoo.co.uk>
> >> wrote:
> >>
> >> Dear Michael,
> >>
> >> To avoid complication, I only groupby using one column.
> >>
> >> It is OK now.  But how do I refer to the new row index?  How do I use a
> >> floating-point index?
> >>
> >> Float64Index([ 1.0,  4.0,  5.0,  6.0,  8.0,  9.0, 10.0, 11.0, 12.0, 13.0, 16.0,
> >>               17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0,
> >>               28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0,
> >>               39.0, 40.0, 41.0, 42.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0,
> >>               51.0, 53.0, 54.0, 55.0, 56.0],
> >>              dtype='float64', name=u'StateFIPS')
> >>
> >>
> >> Regards.
> >>
> >>
> >> David
> >>
> >>
> >>
> >> On Friday, 13 May 2016, 21:43, Michael Selik <michael.selik at gmail.com>
> >> wrote:
> >>
> >>
> >> Here's an example.
> >>
> >>     >>> import pandas as pd
> >>     >>> df = pd.DataFrame({'group': list('AB') * 2, 'data': range(4)},
> >>     ...                   index=list('wxyz'))
> >>     >>> df
> >>        data group
> >>     w     0     A
> >>     x     1     B
> >>     y     2     A
> >>     z     3     B
> >>     >>> df = df.reset_index()
> >>     >>> df
> >>       index  data group
> >>     0     w     0     A
> >>     1     x     1     B
> >>     2     y     2     A
> >>     3     z     3     B
> >>     >>> df.groupby('group').max()
> >>           index  data
> >>     group
> >>     A         y     2
> >>     B         z     3
> >>
> >> If that doesn't help, you'll need to explain what you're trying to
> >> accomplish in detail -- what variables you started with, what
> >> transformations you want to do, and what variables you hope to have when
> >> finished.
> >>
> >> On Fri, May 13, 2016 at 4:36 PM David Shi <davidgshi at yahoo.co.uk>
> >> wrote:
> >>
> >> Hello, Michael,
> >>
> >> I changed groupby with one column.
> >>
> >> The index is different.
> >>
> >> Index([   u'AL',    u'AR',    u'AZ',    u'CA',    u'CO',    u'CT',    u'DC',
> >>           u'DE',    u'FL',    u'GA',    u'IA',    u'ID',    u'IL',    u'IN',
> >>           u'KS',    u'KY',    u'LA',    u'MA',    u'MD',    u'ME',    u'MI',
> >>           u'MN',    u'MO',    u'MS',    u'MT',    u'NC',    u'ND',    u'NE',
> >>           u'NH',    u'NJ',    u'NM',    u'NV',    u'NY',    u'OH',    u'OK',
> >>           u'OR',    u'PA',    u'RI',    u'SC',    u'SD', u'State',    u'TN',
> >>           u'TX',    u'UT',    u'VA',    u'VT',    u'WA',    u'WI',    u'WV',
> >>           u'WY'],
> >>       dtype='object', name=0)
> >>
> >>
> >> How to use this index?
> >>
> >>
> >> Regards.
> >>
> >>
> >> David
> >>
> >>
> >>
> >> On Friday, 13 May 2016, 21:19, David Shi <davidgshi at yahoo.co.uk> wrote:
> >>
> >>
> >> Hello, Michael,
> >>
> >> I typed in df.index
> >>
> >> I got the following
> >>
> >> MultiIndex(levels=[[1.0, 4.0, 5.0, 6.0, 8.0, 9.0, 10.0, 11.0, 12.0,
> >>     13.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0,
> >>     27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0,
> >>     39.0, 40.0, 41.0, 42.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0, 51.0,
> >>     53.0, 54.0, 55.0, 56.0], [u'AL', u'AR', u'AZ', u'CA', u'CO', u'CT', u'DC',
> >>     u'DE', u'FL', u'GA', u'IA', u'ID', u'IL', u'IN', u'KS', u'KY', u'LA',
> >>     u'MA', u'MD', u'ME', u'MI', u'MN', u'MO', u'MS', u'MT', u'NC', u'ND',
> >>     u'NE', u'NH', u'NJ', u'NM', u'NV', u'NY', u'OH', u'OK', u'OR', u'PA',
> >>     u'RI', u'SC', u'SD', u'State', u'TN', u'TX', u'UT', u'VA', u'VT', u'WA',
> >>     u'WI', u'WV', u'WY']],
> >>            labels=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
> >>     15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
> >>     34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], [0, 2, 1, 3,
> >>     4, 5, 7, 6, 8, 9, 11, 12, 13, 10, 14, 15, 16, 19, 18, 17, 20, 21, 23, 22,
> >>     24, 27, 31, 28, 29, 30, 32, 25, 26, 33, 34, 35, 36, 37, 38, 39, 41, 42, 43,
> >>     45, 44, 46, 48, 47, 49]],
> >>            names=[u'StateFIPS', 0])
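A MultiIndex like this one is what grouping by two columns produces. A minimal sketch, with invented data and column names that only mimic the ones in this thread, of how to look rows up in it and how to flatten it back to a number-based index:

```python
import pandas as pd

# Hypothetical data; grouping by two columns yields a two-level index.
df = pd.DataFrame({"StateFIPS": [1.0, 4.0, 4.0],
                   "State": ["AL", "AZ", "AZ"],
                   "value": [1, 2, 3]})

grouped = df.groupby(["StateFIPS", "State"]).sum()

# Rows of a MultiIndex are addressed with a tuple, one entry per level.
one = grouped.loc[(4.0, "AZ"), "value"]

# reset_index turns both levels back into columns with a 0..n-1 index.
flat = grouped.reset_index()

print(grouped.index)
print(flat)
```

Passing as_index=False to groupby is another way to get the flat layout directly, at the cost of not having the group keys as labels.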
> >>
> >> Regards.
> >>
> >>
> >> David
> >>
> >>
> >>
> >> On Friday, 13 May 2016, 21:11, David Shi <davidgshi at yahoo.co.uk> wrote:
> >>
> >>
> >> Dear Michael,
> >>
> >> I have done a number of operations in between.
> >>
> >> Providing that information does not help you.
> >>
> >> How to reset index after grouping and various operations is of interest.
> >>
> >> How to type in a command to find out its current dataframe?
> >>
> >> Regards.
> >>
> >> David
> >>
> >>
> >> On Friday, 13 May 2016, 20:58, Michael Selik <michael.selik at gmail.com>
> >> wrote:
> >>
> >>
> >> Just in case I misunderstood, why don't you make a little example of
> >> before and after the grouping? This mailing list does not accept
> >> attachments, so you'll have to make do with pasting a few rows of
> >> comma-separated or tab-separated values.
> >>
> >> On Fri, May 13, 2016 at 3:56 PM Michael Selik <michael.selik at gmail.com>
> >> wrote:
> >>
> >> In order to preserve your index after the aggregation, you need to make
> >> sure it is considered a data column (via reset_index) and then choose how
> >> your aggregation will operate on that column.
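The advice above can be sketched as follows. The frame is the same toy one used elsewhere in this thread, and the choice of 'first' for the old index and 'sum' for the data is only an illustration, not a recommendation.

```python
import pandas as pd

df = pd.DataFrame({"group": list("AB") * 2, "data": range(4)},
                  index=list("wxyz"))

# Step 1: move the index into an ordinary column named 'index'.
flat = df.reset_index()

# Step 2: pick an aggregation per column, including the former index.
result = flat.groupby("group").agg({"index": "first", "data": "sum"})

# Step 3 (optional): go back to a plain number-based index.
numbered = result.reset_index()

print(result)
print(numbered)
```

Which aggregation suits the former index depends on what it meant; 'first', 'min', 'max' or collecting the labels into a list are all possible.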
> >>
> >> On Fri, May 13, 2016 at 3:29 PM David Shi <davidgshi at yahoo.co.uk>
> >> wrote:
> >>
> >> Hello, Michael,
> >>
> >> Why reset_index before grouping?
> >>
> >> Regards.
> >>
> >> David
> >>
> >>
> >> On Friday, 13 May 2016, 17:57, Michael Selik <michael.selik at gmail.com>
> >> wrote:
> >>
> >>
> >>
> >>
> >> On Fri, May 13, 2016 at 12:27 PM David Shi via Python-list <
> >> python-list at python.org> wrote:
> >>
> >> I lost my indexes after grouping in Pandas.
> >> I managed to reset_index and got back the index column.
> >> But how can I get back an index row?
> >>
> >>
> >> Was the grouping an aggregation? If so, the original indexes are
> >> meaningless. What you could do is reset_index before the grouping and when
> >> you aggregate decide how to handle the formerly-known-as-index column (min,
> >> max, mean, ?).
> >>
> --
> https://mail.python.org/mailman/listinfo/python-list
>



-- 
"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the
machine wrong figures, will the right answers come out?' I am not able rightly
to apprehend the kind of confusion of ideas that could provoke such a question."

-Charles Babbage, 19th century English mathematician, philosopher, inventor
and mechanical engineer who originated the concept of a programmable
computer.


