[scikit-learn] R user trying to learn Python

C W tmrsg11 at gmail.com
Tue Jun 20 01:20:41 EDT 2017


I am catching up on all the replies; apologies for the delay. (Replied in
reverse order.)

@ Gaël,
Thanks for your comments. I actually started with 1) Data Camp courses and
2) Python for Data Science book.

Here's my review:
1) The course: it is fantastic! But they only give you a flavor of A FEW
things.
2) The book: it is a quick crash course, but not enough for you to take off.
See code below.

# Toy Python code
import numpy as np
import pandas as pd

N = 100
df = pd.DataFrame({
    'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
    'x': np.linspace(0, stop=N-1, num=N),
    'y': np.random.rand(N),
    'C': np.random.choice(['Low', 'Medium', 'High'], N).tolist(),
    'D': np.random.normal(100, 10, size=N).tolist(),
})
df.x           # returns column x
len(dir(df))   # counts the names reachable after "df."
# end of Python code

My confusion:
a) df.x gives you column x, but why? I thought things after the dot were
actions, more like verbs performed on the object (df, in this case).
b) len(dir(df)) gives 431. I only created a dataframe, so where did all these
431 things come from? Is there documentation about this? It scares me
because I only asked for a dataframe.
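
A minimal sketch of what may be going on in (a) and (b), using a hypothetical Table class rather than pandas itself. In Python, anything after the dot is an attribute, and a method is simply an attribute that can be called. pandas exposes a column as an attribute in a similar spirit when the column name is a valid identifier, and most of the names that dir() reports are inherited from parent classes.

```python
class Table:
    """Hypothetical stand-in for a DataFrame, for illustration only."""

    def __init__(self, **columns):
        # Each column becomes a plain data attribute, reachable with a dot.
        for name, values in columns.items():
            setattr(self, name, values)

    def nrows(self):
        # A method is also an attribute; it just happens to be callable.
        first_column = next(iter(vars(self).values()))
        return len(first_column)


t = Table(x=[1, 2, 3], y=[4, 5, 6])

print(t.x)         # data attribute: no parentheses needed
print(t.nrows())   # method: called with parentheses
print(callable(t.x), callable(t.nrows))

# dir() lists every name reachable after the dot, including names
# inherited from parent classes, so even a bare object has many.
print(len(dir(object())))
print([name for name in dir(t) if not name.startswith("_")])
```

The last line prints only the three names defined here; everything else in dir(t) comes from the built-in object class, which is the same mechanism that fills a real dataframe with hundreds of names.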

@ Gael
This is a pretty solid reference. It explains methods among other things,
which is awesome! I think methods are the barrier to entry for R users.

@ Mail
Thanks for the details, I will try to pick up these computer science
terms. It has been a brutal week.

@Massimo
Yes, I have used that. It is indeed a great reference for one-to-one
equivalents.

Thanks!





On Tue, Jun 20, 2017 at 12:32 AM, Gaël Pegliasco via scikit-learn <
scikit-learn at python.org> wrote:

> And, to answer your last question, a good way to learn data science using
> Python is, in my opinion, the "Python Data Science Handbook", which you can
> read as Jupyter notebooks:
>
> https://github.com/jakevdp/PythonDataScienceHandbook
>
>
> On 20/06/2017 at 06:28, Gaël Pegliasco via scikit-learn wrote:
>
> Hi,
>
> You may find these R/Python comparison sheets useful for understanding the
> syntax and concepts of both languages:
>
>
>    - https://www.datacamp.com/community/tutorials/r-or-python-for-data-analysis
>    - http://pandas.pydata.org/pandas-docs/stable/comparison_with_r.html
>
>
> Gaël,
>
> On 18/06/2017 at 18:02, C W wrote:
>
> Dear Scikit-learn,
>
> What are some good ways and resources to learn Python for data analysis?
>
> I am extremely frustrated using this thing. Everything comes after a dot!
> Why would you type the same thing at the beginning of every line? It's not
> efficient.
>
> Code 1:
> y_sin = np.sin(x)
> y_cos = np.cos(x)
>
> I know you can import the entire package without the "as np", but I see
> np.something as the standard. Why?
>
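A rough illustration of why the prefix is kept, using only the standard library and not NumPy: two modules can define a function with the same name, and the prefix records which one is being called. The "np." prefix plays the same role for NumPy's names.

```python
import math
import cmath

# Both modules define a function called sqrt; the prefix says which one runs.
print(math.sqrt(4))     # real-valued square root
print(cmath.sqrt(-4))   # complex-valued square root

# Importing the bare names lets the later import silently replace the earlier.
from math import sqrt
from cmath import sqrt

print(sqrt(4))          # this is now the cmath version
```

With bare names, the last import wins without any warning, which is one reason the short prefix is usually preferred.
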
> Code 2:
> model = LogisticRegression()
> model.fit(X_train, y_train)
> model.score(X_test, y_test)
>
> In R, everything is saved to a variable. In the code above, what if I
> accidentally ran model.fit()? I would not know.
>
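A minimal sketch of the pattern behind Code 2, using a hypothetical MeanModel class that is not part of scikit-learn. It assumes the scikit-learn conventions that fit() stores what it learned on the object itself, in attributes ending with an underscore, and returns the object, so the fitted state can be inspected afterwards.

```python
class MeanModel:
    """Hypothetical estimator that predicts the mean of y; not part of scikit-learn."""

    def fit(self, X, y):
        # Learned state is stored on the object itself, with a trailing underscore.
        self.mean_ = sum(y) / len(y)
        return self  # returning self allows: model = MeanModel().fit(X, y)

    def predict(self, X):
        return [self.mean_ for _ in X]


model = MeanModel()
print(hasattr(model, "mean_"))   # False: not fitted yet

model.fit([[0], [1], [2]], [1.0, 2.0, 3.0])
print(hasattr(model, "mean_"))   # True: fit() changed the object in place
print(model.predict([[5]]))      # [2.0]
```

So the result of fit() is saved, just inside the model object rather than in a new variable.
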
> Code 3:
> from sklearn import linear_model
> reg = linear_model.Ridge(alpha=0.5)
> reg.fit([[0, 0], [0, 0], [1, 1]], [0, 0.1, 1])
>
> In the code above, sklearn > linear_model > Ridge, one lives inside the
> other. It feels like there are multiple layers; how deep do I have to dig?
>
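A small sketch of the layering, using the standard library's os package as a stand-in for sklearn: a package contains modules, and a module contains functions or classes. The import statement decides how much of that path has to be typed later.

```python
import os
import os.path
from os import path
from os.path import join

# Three spellings of the same function, at different depths.
print(os.path.join("data", "file.csv"))
print(path.join("data", "file.csv"))
print(join("data", "file.csv"))

# All three names refer to one and the same function object.
print(os.path.join is path.join is join)
```

By the same logic, Code 3 could import the class directly with "from sklearn.linear_model import Ridge" and then write Ridge(alpha=0.5).
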
> Can someone explain the mentality behind this setup?
>
> Thank you very much!
>
> M
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>
> --
> [image: Makina Corpus] <http://makina-corpus.com>
> Newsletters <http://makina-corpus.com/formulaires/newsletters> |
> Training <http://makina-corpus.com/formation> | Twitter
> <https://twitter.com/makina_corpus>
> Gaël Pegliasco
> Project manager
> Tel: 02 51 79 80 84
> Mobile: 06 41 69 16 09
> 11 rue du Marchix FR-44000 Nantes
> --
> @GPegliasco <https://twitter.com/GPegliasco>
> --
> Discover Talend Data Integration
> <http://makina-corpus.com/formation/etl-talend-open-studio>, THE open source
> data integration solution
>
>