Yes, this is a good place to post errors and typos, and reports like this are much appreciated.
You are right. The docstring was missing an ‘r’ in front of the string, which indicates that it should be read as a “raw string”. Without it, the backslashes are interpreted as the start of escape sequences.
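To illustrate the difference, here is a minimal sketch. The function names and docstring text are made up for this example and are not taken from the lecture code.

```python
def plain_docstring():
    """Solve for \theta given \nu."""  # "\t" becomes a tab, "\n" a newline


def raw_docstring():
    r"""Solve for \theta given \nu."""  # backslashes are kept literally


# The plain docstring has lost its backslashes to escape processing
print(repr(plain_docstring.__doc__))

# The raw docstring still contains the text exactly as typed
print(repr(raw_docstring.__doc__))
```

This matters most for docstrings containing LaTeX, where sequences such as \t, \n, \b and \f are common.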
I got a warning when executing the following line of code:
df['population'] = df['population'] * 1e3
Warning:
/Users/billtubbs/anaconda/lib/python2.7/site-packages/ipykernel/__main__.py:3: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
/Users/billtubbs/anaconda/lib/python2.7/site-packages/ipykernel/__main__.py:2: FutureWarning: by argument to sort_index is deprecated, pls use .sort_values(by=…)
from ipykernel import kernelapp as app
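One common way to avoid both warnings is sketched below. It assumes that df was created by selecting columns from a larger DataFrame, which the warning suggests but does not confirm. The data and the names raw and df_sorted are invented for illustration.

```python
import pandas as pd

# Made-up data standing in for the lecture's DataFrame (population in thousands)
raw = pd.DataFrame({
    "country": ["A", "B", "C"],
    "population": [10.0, 30.0, 20.0],
    "tcgdp": [1.0, 3.0, 2.0],
})

# Taking an explicit copy makes it unambiguous that df is a new object,
# so assigning to one of its columns is not flagged as a write to a slice
df = raw[["country", "population"]].copy()
df["population"] = df["population"] * 1e3

# sort_values(by=...) is the replacement for the deprecated sort_index(by=...)
df_sorted = df.sort_values(by="population", ascending=False)
print(df_sorted)
```

The original raw DataFrame is left unchanged by this approach, which may or may not be what you want.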
When I enter the following into an IPython notebook:
%load_ext cythonmagic
I get the warning:
/Users/billtubbs/anaconda/lib/python2.7/site-packages/IPython/extensions/cythonmagic.py:21: UserWarning: The Cython magic has been moved to the Cython package
warnings.warn("""The Cython magic has been moved to the Cython package""")
I think something has been changed since you wrote the guide. The code still works. I just get this warning the first time.
It seems that, contrary to the usual Python slicing for array types (which took a bit of getting used to), Pandas’ own indexing functions do include the end point (e.g. the last row). An example in the lectures can be found here:
Hi Sebastiaan. Thanks for your feedback. You make a good point.
We will actually need to do a review of this lecture soon regarding the .ix operator as Pandas is looking to deprecate this way of indexing in favour of .iloc and .loc in version 0.20. I understand that the .ix operator has been a source of bugs for pandas users and they no longer want to support it.
Thank you both for your replies. The link to the GitHub discussion is quite useful because, starting here, it also addresses how to do mixed indexing without ix. From that I gather that instead of
df.ix[2:5, ['country', 'tcgdp']]
one could write
df.loc[df.index[2:5], ['country', 'tcgdp']]
to get the expected behavior: 5 - 2 = 3 rows, with row 5 not included. (The alternative via df.iloc may not work so well with multiple columns because the df.columns.get_loc method apparently does not accept lists, but I may be wrong here.)
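On the df.iloc alternative: get_loc does take a single label, but Index.get_indexer accepts a list of labels and returns their integer positions, so a purely positional version seems possible. A minimal sketch, using a made-up DataFrame with a default integer index rather than the lecture data:

```python
import pandas as pd

# Made-up stand-in for the lecture's DataFrame, with index labels 0..5
df = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F"],
    "POP": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "tcgdp": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})
cols = ["country", "tcgdp"]

# Label-based slicing with .loc includes the end label: rows 2, 3, 4 and 5
inclusive = df.loc[2:5, cols]

# Slicing the index by position first excludes the end point: rows 2, 3, 4
via_loc = df.loc[df.index[2:5], cols]

# Positional alternative: get_indexer maps the list of column labels
# to their integer positions, which .iloc can use directly
via_iloc = df.iloc[2:5, df.columns.get_indexer(cols)]

print(len(inclusive), len(via_loc), len(via_iloc))
```

Both via_loc and via_iloc select the same three rows here, while the plain .loc slice returns four because the end label is included.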
In any case, thank you again for the informative link.