
I’ve heard this before, but what do you get over pandas, which has groupby, filtering, etc.? Skipping one dependency? Usage in languages other than Python?


If pandas gives you a full relational API with arbitrary data then in isolation it doesn’t make much of a difference. SQL is more portable so to speak, but a library like that may introduce less friction. Pragmatism is advised here.

The big idea here is to use relational logic programming to express data transformation outside of storage access. The paper “Out of the Tar Pit” proposed this as a way to reduce accidental complexity.
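
To make that concrete, here is a rough sketch of a relational query over plain in-memory data, using only Python's built-in sqlite3 module. The `orders` table, its columns, and the `top_spenders` helper are made up for illustration; nothing here comes from the paper.

```python
import sqlite3


def top_spenders(orders, min_total):
    """Return (customer, total) pairs whose summed amount is >= min_total.

    `orders` is an ordinary list of (customer, amount) tuples; SQLite is
    only used as a throwaway in-memory query engine, not as storage.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, chosen only for this example.
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
        return conn.execute(
            """
            SELECT customer, SUM(amount) AS total
            FROM orders
            GROUP BY customer
            HAVING SUM(amount) >= ?
            ORDER BY total DESC
            """,
            (min_total,),
        ).fetchall()
    finally:
        conn.close()


orders = [("ann", 30.0), ("bob", 5.0), ("ann", 12.5), ("cy", 20.0)]
print(top_spenders(orders, 20))
```

The data never touches disk, and the transformation is stated declaratively instead of as loops over the list.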


SQLite probably has much wider (and more stable) support in other languages than pandas.

Also, it is a different thing. Pandas is very nice for data analytics or crunching numbers on recurring data, but I wouldn't replace a database with it.


I agree that it can’t replace a database for most use cases. But I’ve yet to find a “real” use case for an in-memory database.


I recently started using pandas to do groupby and aggregation. It’s nice to have, but the whole time I kept wishing it would just let me run an SQL query without adding another dependency. Having learned SQL long ago, I find it to be much more intuitive and expressive. I guess it’s all just what you know.


Pandas can read from and write to SQLite. You can dump a dataframe to SQLite, transform it there, then load the result back from SQL. Whether it’s worth it depends on your case, I guess.
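
A minimal sketch of that round trip, assuming pandas is installed and using an in-memory SQLite database. The `sales` table, its columns, and the `aggregate_via_sqlite` helper are invented for the example.

```python
import sqlite3

import pandas as pd


def aggregate_via_sqlite(df: pd.DataFrame) -> pd.DataFrame:
    """Dump a dataframe to SQLite, aggregate with SQL, load the result back."""
    conn = sqlite3.connect(":memory:")
    try:
        # Write the dataframe as a table; "sales" is a hypothetical name.
        df.to_sql("sales", conn, index=False)
        query = """
            SELECT region, SUM(amount) AS total
            FROM sales
            GROUP BY region
            ORDER BY region
        """
        # Read the query result back into a new dataframe.
        return pd.read_sql_query(query, conn)
    finally:
        conn.close()


df = pd.DataFrame(
    {"region": ["north", "south", "north"], "amount": [10, 5, 7]}
)
result = aggregate_via_sqlite(df)
print(result)
```

For a groupby this small, `df.groupby("region")["amount"].sum()` does the same job without the copy, so the detour mostly pays off when the SQL is easier to write or read than the equivalent pandas chain.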



