I’ve heard this before, but what do you get over pandas, which has groupby, filt...

dgb23 · on June 20, 2021

If pandas gives you a full relational API over arbitrary data, then in isolation it doesn’t make much of a difference. SQL is more portable, so to speak, but a library like pandas may introduce less friction. Pragmatism is advised here.

The big idea here is to use relational logic programming to express data transformation outside of storage access. The paper „Out of the Tar Pit“ proposed this as a way to reduce accidental complexity.
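To make that concrete, here is a minimal sketch using only Python's standard-library `sqlite3` module. The `orders` rows and the table and column names are made up for illustration. The data lives in ordinary Python objects, and the in-memory database is there only to evaluate the relational query, not to store anything.

```python
import sqlite3

# Hypothetical in-process data; no file or server is involved.
orders = [
    ("alice", "books", 12.5),
    ("bob", "books", 7.5),
    ("alice", "games", 30.0),
]

# ":memory:" creates a throwaway database that exists only for this connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, category TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# The transformation is stated declaratively as a relational query.
totals = conn.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()

print(totals)  # [('alice', 42.5), ('bob', 7.5)]
```

Whether this beats a few lines of plain Python depends on how involved the query gets; for joins and multi-level aggregation the SQL version tends to stay readable.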

atoav · on June 20, 2021

SQLite probably has much wider (and more stable) support across other languages than pandas.

Also, it is a different thing. Pandas is very nice for data analytics or crunching numbers on recurring data, but I wouldn't replace a database with it.

pletnes · on June 20, 2021

I agree that it can’t replace a database for most use cases. But I’ve yet to find a «real» use case for an in-memory database.

avidphantasm · on June 20, 2021

I recently started using pandas to do groupby and aggregation. It’s nice to have, but the whole time I kept wishing it would just let me run an SQL query without adding another dependency. Having learned SQL long ago, I find it to be much more intuitive and expressive. I guess it’s all just what you know.

pletnes · on June 20, 2021

Pandas can read from and write to SQLite. You can dump a dataframe to SQLite, transform it with SQL, then load the result back into a dataframe. Whether it’s worth it depends on your case, I guess.
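A rough sketch of that round trip, assuming pandas is installed. It uses `DataFrame.to_sql` and `pandas.read_sql_query` with a standard-library `sqlite3` connection. The `sales` table and its columns are made up for illustration.

```python
import sqlite3

import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"city": ["Oslo", "Oslo", "Bergen"], "sales": [10, 20, 5]})

# Dump the dataframe into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
df.to_sql("sales", conn, index=False)

# Transform with plain SQL and load the result back as a dataframe.
result = pd.read_sql_query(
    "SELECT city, SUM(sales) AS total FROM sales GROUP BY city ORDER BY city",
    conn,
)
conn.close()

print(result)
```

The equivalent in pure pandas would be something like `df.groupby("city", as_index=False)["sales"].sum()`, so for a query this small the detour through SQLite mostly buys familiarity with SQL.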