• 2 Posts
  • 724 Comments
Joined 3 years ago
Cake day: July 11th, 2023




  • Are you a native French speaker? Maybe you heard it differently from me, but while I am all for nuance, let’s not sanewash people; let’s take them at their word.

    I use plenty of software where the developers are not primarily focused on security, but his line of reasoning sounds just plain dangerous for an OS developer. Maybe he phrased it badly, but that would be up to him to clarify, and we shouldn’t do that for him.



  • The delimiter isn’t really the issue. It’s that there are lots and lots of weird edge cases that break reading CSVs. If you use commas, at minimum you need to escape commas in the data, or quote strings that might contain commas… But now you have to deal with the possibility of a quote character or your escape character in the data.
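A minimal sketch of that quoting problem, using Python's standard `csv` module. The sample rows are made up for illustration; the point is only that a naive split on commas falls apart as soon as a field contains a comma, a quote, or a line break, while a real CSV parser round-trips them.

```python
import csv
import io

# Hypothetical sample data: one field has commas and quotes, another a newline.
rows = [
    ["id", "note"],
    ["1", 'He said "hi", then left'],
    ["2", "line one\nline two"],
]

# Write with minimal quoting: only fields that need quotes get them,
# and embedded quote characters are doubled.
buf = io.StringIO(newline="")
csv.writer(buf, quoting=csv.QUOTE_MINIMAL).writerows(rows)
text = buf.getvalue()

# Naive approach: split on line breaks and commas. This miscounts
# both the rows and the fields.
naive = [line.split(",") for line in text.splitlines()]

# Proper approach: let the csv module undo the quoting.
parsed = list(csv.reader(io.StringIO(text, newline="")))

print(len(naive), "naive rows vs", len(parsed), "parsed rows")
```

This only works if the writer actually quotes its output. A producer that cannot quote or escape leaves the reader with nothing to parse against.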

    Then you have the fact that CSVs can be written with so many different character encodings, and nothing in the file says which one was used, so special characters get mangled wherever they occur.
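To illustrate the encoding problem, here is a small sketch. The helper name `find_bad_utf8_lines` and the sample bytes are assumptions made up for this example. It shows how UTF-8 text read as Latin-1 turns into mojibake, and one way to hunt down lines that are not valid UTF-8 in an otherwise clean file.

```python
def find_bad_utf8_lines(data: bytes):
    """Return (line_number, raw_line) for every line that is not valid UTF-8."""
    bad = []
    for lineno, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((lineno, raw))
    return bad


# UTF-8 bytes decoded with the wrong codec: the classic mangled character.
mangled = "café".encode("utf-8").decode("latin-1")

# Hypothetical file: mostly UTF-8, with one line written as Latin-1.
sample = "id,name\n1,café\n".encode("utf-8") + "2,naïve\n".encode("latin-1")
bad_lines = find_bad_utf8_lines(sample)

print(mangled)
print(bad_lines)
```

Note that this only catches bytes that are invalid as UTF-8. Text that was already mangled and then re-saved as valid UTF-8 will pass the check and has to be found some other way.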

    Aaand then you have all the issues that come with lack of metadata - good formats will at least tell you the type of data in each column so you don’t have to guess.
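A rough sketch of what that type guessing looks like in practice, again with Python's standard `csv` module. The column names and values are invented for the example, and the "guess" is deliberately simplistic.

```python
import csv
import io

# Hypothetical file: a postal code with a leading zero, a date, and an amount.
text = "zip,signup,amount\n01234,2023-07-11,10.50\n"

# A CSV carries no type information, so every value is read back as a string.
row = next(csv.DictReader(io.StringIO(text, newline="")))

# A simplistic guess ("all digits means integer") silently corrupts the
# postal code by dropping its leading zero.
guessed = {key: (int(value) if value.isdigit() else value)
           for key, value in row.items()}

print(row["zip"], "->", guessed["zip"])
```

A format that stores a schema alongside the data avoids this, because the reader is told the column is text rather than having to infer it.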

    Let’s see, it’s also really annoying to include any binary data in a CSV, there’s no redundancy or parity check to catch corrupted data, and they aren’t compressed, so you need to tack on compression if you want efficient storage, but that means you always have to read the whole CSV file for any task.

    Oh, that brings me to the joys of modern columnar formats where you can read selected columns super fast without reading the whole file.

    Oh god, I really kept going there. Sorry. It’s been a year.








  • Jason2357@lemmy.ca to Science Memes@mander.xyz · Just the way we likes it. · 14 days ago

    God I hate csv with the fire of a thousand suns.

    Contractors never seem to know how to write them correctly. Last year, one even provided “csv”s that were just Oracle error messages. lol. Another told me their system could not quote string columns, escape commas, or use anything but a comma as the separator, so there were unpredictable numbers of commas in the rows whenever the actual data contained commas. Total nightmare. And so much of my data has special character issues because somewhere in the pipeline a text encoding was wrong, and there is exactly one mangled character in 5 million lines for me to find.
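For the unquoted-comma case, about the best you can do on the receiving end is flag the damaged rows. A minimal sketch, assuming the header row is trustworthy; the function name `find_ragged_rows` and the sample data are made up for illustration.

```python
import csv
import io


def find_ragged_rows(text: str):
    """Return (line_number, fields) for rows whose field count differs from the header."""
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader)
    width = len(header)
    ragged = []
    for row in reader:
        if len(row) != width:
            # reader.line_num is the number of source lines read so far.
            ragged.append((reader.line_num, row))
    return ragged


# Hypothetical export where a value containing a comma was written unquoted.
sample = "id,city,pop\n1,Ottawa,1000\n2,Washington, D.C.,700\n"
print(find_ragged_rows(sample))
```

This finds the broken rows but cannot repair them in general: once the quoting is missing, there is no reliable way to tell which comma was data and which was a separator.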

    Give me the data as close to the source data as you can. If it is a database, then a database dump or access to a clone of your database is the best option by far. I don’t care how obscure your shit is, I’ll do the conversion myself.

    For intermediate data, use something like parquet, or language-specific formats like RData or pickle files. Maaaaybe very carefully created csv files for archival purposes, but even then, I think parquet is safe for the long haul nowadays.


  • I will have to yield to your experience then. I mainly thought of it as a naive type of sensible argument, given people were not all that concerned about tracking and particularly browser fingerprinting. I guess back then, the main thing was web developers who used flash needed to check for it. But those people were anti-open web back then and deserved to be ignored by the browser makers.

    I am guessing you were strongly in the open web camp back then. I am glad we sort of won that particular battle, even if we lost so many others.






  • Save yourself the brain aneurysm from watching anti-science 1990s “perpetual motion” bullshit and look up “wood gasification” (e.g., https://en.wikipedia.org/wiki/Wood_gas_generator)

    It’s not some secret or “free energy”. It is just converting energy from wood in a 2-step process: instead of burning it directly for heat, you extract the flammable gas, which is a more flexible energy source (and can be used in internal combustion engines like generators). It’s been around for a long time. The Nazis even tried to run a tank on it.

    Today, many large-scale “biogas” reactors use similar mechanisms, though they typically use bacteria to produce gas rather than heat from partial combustion, which allows them to work on things that don’t burn cleanly, like garbage or sewage. I know a hobbyist who built a traditional combustion-based wood gasifier generator. It worked, but it was a lot of work to keep the gas flowing and the engine running. No free energy, you still need the (bio) fuel.