What’s your take on Parquet?

I’m still reading into it. Why is it so closely tied to Apache? Does only Apache push it? Meaning, if Apache drops it, would nobody else be interested in pushing it further?

It’s published under the Apache License 2.0, which is a permissive license. Is there a drawback to the license?

Do you use it? When?

I assume CSV is sufficient for sharing small data. I also assume CSV is more accessible than Parquet.

  • mvirts@lemmy.world
    2 months ago

    Parquet 4 eva

    CSV is for arcane software, or for when you don’t know where the data is going.

    HDF5 is for MATLAB interoperability.

    Otherwise I use Parquet (ORC could also work, but I never actually use it). Sometimes Parquet has problems with pandas or Polars, but I’ve always been able to fix them by using pyarrow.