https://www.liquidpoker.net/


LP international Poland    Contact            Users: 441 Active, 2 Logged in - Time: 16:43

NEED PYTHON SKILLS

New to LiquidPoker? Register here for free!
Forum Index > Poker Blogs
Into Infinity   United States. Nov 30 2016 01:41. Posts 1884
lets say i have an .xlsx sheet which i load into pandas with a merged cell

A W some merged value*
A X some merged value*
A Y some merged value*
A Z

(pretend the *'s are all one big merged cell)


it will show up in pandas as this:

A W some merged value
A X nan
A Y nan
A Z nan

so i type this:

df['col3'] = pd.Series(df['col3']).fillna(method='ffill')

and i get this:

A W some merged value
A X some merged value
A Y some merged value
A Z some merged value


not what i want because row 4, column 3 was blank and not part of the merged cell.

how do i get around this?

0 votes
Facebook Twitter

Rinny   United States. Dec 01 2016 01:51. Posts 600

Pretty specialized tech your looking for I'd try odesk


Joe   Czech Republic. Dec 01 2016 20:17. Posts 5987

Apparently you are trying to load the 3rd column as a numeric column, but in rows 2, 3 and 4 there are values that can't be converted into a number. Blank cell is not a number, that is obvious. Then you use fillna with forward fill method, so your last valid value gets copied into all the 3 cells under it because they were evaluated as NaN after reading the .xlsx and conversion.

Show example of the real data if you need a more specific answer.

there is a light at the end of the tunnel... (but sometimes the tunnel is long and deep as hell)Last edit: 01/12/2016 20:18

Into Infinity   United States. Dec 01 2016 23:12. Posts 1884

 Last edit: 01/12/2016 23:16

Joe   Czech Republic. Dec 03 2016 00:41. Posts 5987

Well that is not a pandas problem though. In excel the merged cell is in reality just the first cell being visually extended over those other cells. Those other cells remain blank (or more exactly, they keep their original content, but they are not visible).

So when you are trying to import, the values in the cells in "Into Infinity Note" column are (from top to bottom):
Old Version
(blank)
(blank)
(blank)
blank)

So there is absolutely no difference between the 2nd plus 3rd and the 4th or 5th cell in that column. You either interpret all of them or neither of them as NaN.

Your best solution is to write a macro for Excel that will unmerge (split) the merged cells and copy the value of the merged cells into each of them. Or if you have just small data, then do it by hand. And don't use merged cells in the future.

Here is a link to a stackoverflow topic about how to create said macro:
http://stackoverflow.com/questions/6464265/how-to-keep-value-of-merged-cells-in-each-cell

A Python solution would involve writing your own xlsx reader, that could extract the info about merged cells from the format.

there is a light at the end of the tunnel... (but sometimes the tunnel is long and deep as hell)Last edit: 03/12/2016 00:45

Into Infinity   United States. Dec 03 2016 05:06. Posts 1884

thanks, yep i hate merged cells. was doing work with someone's else data. figured i would try to do it in python this is a small part of a bigger program and wanted to keep it all in one thing

 Last edit: 03/12/2016 05:08

 



Poker Streams

















Copyright © 2024. LiquidPoker.net All Rights Reserved
Contact Advertise Sitemap