Large Text Compression Benchmark

https://www.mattmahoney.net/dc/text.html
1
The empire of C++ strikes back with Safe C++ blueprint
  • The only (arguably*) baseless claim in that quote is this part:

    You do understand you're making that claim on the post discussing the proposal of Safe C++ ?

    And to underline the absurdity of your claim, would you argue that it's impossible to write a "hello, world" program in C++ that is memory-safe? From that point onward, what would it take to make it violate any memory constraints? Are those things avoidable? Think about it for a second before saying nonsense about impossibilities.

    -2
  • The HTTP QUERY Method
  • Custom methods won't have the benefit of being dealt with as if they shared specific semantics, such as being treated as safe methods or idempotent, but ultimately that's just an expected trait that anyone can work with.

    In the end, specifying a new standard HTTP method like QUERY extends some very specific assurances regarding semantics, such as whether frameworks should enforce CSRF tokens based on whether a QUERY has the semantics of a safe method or not.

    3
  • The empire of C++ strikes back with Safe C++ blueprint
  • If you could reliably write memory safe code in C++, why do devs put memory safety issues into their code bases then?

    That's a question you can ask the guys promoting the adoption of languages marketed based on memory safety arguments. I mean, even Rust has a fair share of CVEs whose root cause is unsafe memory management.

    -4
  • The empire of C++ strikes back with Safe C++ blueprint
  • The problem with C++ is it still allows a lot of unsafe ways of working with memory that previous projects used and people still use now.

    Why do you think this is a problem? We have a tool that gives everyone the freedom to manage resources the way it suits their own needs. It even went as far as explicitly supporting garbage collectors right up to C++23. Some frameworks adopted and enforced their own memory management systems, such as Qt.

    Tell me, exactly why do you think this is a problem?

    -6
  • The empire of C++ strikes back with Safe C++ blueprint
  • From the article.

    Josh Aas, co-founder and executive director of the Internet Security Research Group (ISRG), which oversees a memory safety initiative called Prossimo, last year told The Register that while it's theoretically possible to write memory-safe C++, that's not happening in real-world scenarios because C++ was not designed from the ground up for memory safety.

    That baseless claim doesn't pass the smell check. Does the fact that a feature was not rolled out in the mid-90s mean that it's not available today? Utter nonsense.

    If your paycheck is highly dependent on pushing a specific tool, of course you have a vested interest in diving head-first in a denial pool.

    But cargo cult mentality is here to stay.

    1
  • Dissecting the GZIP format (2011)

    https://www.infinitepartitions.com/art001.html
    0
    The HTTP QUERY Method
  • However, we’re still implementing IPv6, so how long until we could actually use this?

    We can already use custom verbs as we please: we only need to have clients and servers agree on a contract.

    What we don't have is the benefit of high-level "batteries included" web frameworks doing the work for us.
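
To make the "clients and servers agree on a contract" point concrete, here is a minimal sketch using only Python's standard library. The handler name `QueryHandler`, the helper `send_query`, the `/search` path and the reply format are all made up for illustration; nothing here implements the QUERY semantics from the draft (safe, idempotent, cacheable), it only shows that a non-standard verb travels over the wire fine when both ends expect it.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class QueryHandler(BaseHTTPRequestHandler):
    # http.server dispatches each request to a method named do_<VERB>,
    # so supporting a custom verb only takes a matching method name.
    def do_QUERY(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        reply = b"query was: " + body  # hypothetical reply format
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def send_query(payload: bytes) -> tuple[int, bytes]:
    """Start a throwaway local server, send one QUERY request, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), QueryHandler)  # port 0: let the OS pick
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        # http.client sends whatever method string it is given.
        conn.request("QUERY", "/search", body=payload,
                     headers={"Content-Type": "text/plain"})
        resp = conn.getresponse()
        result = (resp.status, resp.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(send_query(b"name=foo"))
```

The part that is missing without a standard is everything around this: intermediaries, caches and frameworks have no idea how to treat the verb.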

    3
  • dominikberner.ch Using Conan as a CMake Dependency Provider

    With the addition of dependency providers in CMake 3.24 using Conan to manage dependencies becomes easier and more integrated. This post shows a step-by-step guide on how to use Conan as a CMake dependency provider.

    0
    ashishb.net Always support compressed response in an API service

    If you run any web service always enable support for serving compressed responses. It will save egress bandwidth costs for you. And, more importantly, for your users. Over time, the servers as well as client devices have become more powerful, so, compressing/decompressing data on the fly is cheap.

    0
    www.ietf.org The HTTP QUERY Method

    This specification defines a new HTTP method, QUERY, as a safe, idempotent request method that can carry request content.

    6
    B-Trees: More Than I Thought I'd Want to Know
  • So that’s where I would say, as long as performance doesn’t matter it’s better to default to B-Tree maps than to hash maps, because the chance of avoiding bugs is more valuable than immeasurable performance benefits (...)

    I don't quite follow. What leads you to believe that a B-Tree map implementation would have a lower chance of having a bug when you can simply pick any standard and readily available hash map implementation?

    Also, you fail to provide any concrete reasoning for b-tree maps. It's not performance on any of the dictionary operations, and bugs ain't it as well. What's the selling point that you are seeing?

    1
  • martin.ankerl.com Comprehensive C++ Hashmap Benchmarks 2022

    Where I've spent way too much time creating benchmarks of C++ hashmaps

    0
    B-Trees: More Than I Thought I'd Want to Know
  • the reason I tend to recommend B-Tree maps over hash maps for ordinary programming is consistent iteration order.

    Hash maps tend to be used to take advantage of constant time lookup and insertion, not iterations. Hash maps aren't really suited to that use case.

    Programming languages tend to provide two standard dictionary containers: a hash map implementation suited for lookups and insertions, and a tree-based map that supports sorting elements by key.
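
A rough sketch of the difference between the two kinds of container, in Python. Python's standard library has no tree-based map, so `SortedMap` below is a hypothetical stand-in that keeps keys in a sorted list via `bisect`; it is not a B-tree and its insertion cost is linear. Also note that Python's built-in `dict` iterates in insertion order, which many hash maps in other languages do not guarantee.

```python
import bisect


class SortedMap:
    """Toy ordered map: keys kept in a sorted list, values in a dict."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def __setitem__(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)  # keep keys sorted on insert
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]

    def items(self):
        # Iteration order depends only on the keys, not on insertion history.
        return [(k, self._values[k]) for k in self._keys]


hashed = {}
ordered = SortedMap()
for key in ["pear", "apple", "fig"]:
    hashed[key] = len(key)
    ordered[key] = len(key)

print(list(hashed.items()))  # insertion order: pear, apple, fig
print(ordered.items())       # key order: apple, fig, pear
```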

    2
  • 5
    bunny.net What is GZIP Compression and is it Lossless?

    GZIP Compression is an extremely popular technique of lossless compression for photos, videos, & web pages. It is used by a large number of websites.

    0
    RFC 7493: The I-JSON Message Format
  • Yeah, the quality on Lemmy is nowhere (...)

    Go ahead and contribute things that you find interesting instead of wasting your time whining about what others might like.

    So far, all you're contributing is whiny shitposting. You can find plenty of that in Reddit too.

    1
  • RFC 7493: The I-JSON Message Format
  • It’s from 2015, so it's probably what you are doing anyway

    No, you are probably not using this at all. The problem with JSON is that these details are all handled in an implementation-defined way, and most implementations just fail/round silently.

    Just give it a try and send down the wire a JSON with, say, a huge integer, and see if that triggers a parsing error. For starters, in .NET both Newtonsoft and System.Text.Json set a limit of 64 bits.

    https://learn.microsoft.com/en-us/dotnet/api/system.text.json.jsonserializeroptions.maxdepth
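
For a quick way to see the "implementation-defined" part, here is a small Python sketch. Python's `json` module happens to parse integer literals into arbitrary-precision ints, so it accepts the value exactly; passing `parse_int=float` is used here only to imitate a parser that stores every number as a double. The field name `id` and the value are made up for the example.

```python
import json

# 2**64 + 1: does not fit in 64 bits and is far beyond 2**53.
text = '{"id": 18446744073709551617}'

# Default behaviour in Python: exact, arbitrary-precision integer.
exact = json.loads(text)["id"]

# Imitate a double-only parser: the value is rounded with no error raised.
rounded = json.loads(text, parse_int=float)["id"]

print(exact)         # 18446744073709551617
print(int(rounded))  # 18446744073709551616
```

Same document, two different numbers, and neither parse reports a problem.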

    2
  • RFC 7493: The I-JSON Message Format
  • Why restrict to 54-bit signed integers?

    Because number is a double, and IEEE 754 specifies the mantissa of double-precision numbers as 53 bits + sign.

    Meaning, it's the highest integer precision that a double-precision object can express.

    I suppose that makes sense for maximum compatibility, but feels gross if we’re already identifying value types.

    It's not about compatibility. It's because JSON only has a number type which covers both floating point and integers, and number is implemented as a double-precision value. If you have to express integers with a double-precision type, when you go beyond 53 bits you will start to experience loss of precision, which goes completely against the notion of an integer.
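
The precision cliff is easy to check, since Python's `float` is an IEEE 754 double. A small sketch follows; `survives_double` is a helper name invented for this example, and `MAX_SAFE` is the upper bound of the range RFC 7493 recommends for integers.

```python
MAX_SAFE = 2**53 - 1  # upper bound of the integer range recommended by RFC 7493


def survives_double(n: int) -> bool:
    """True if n is unchanged after a round trip through a double."""
    return int(float(n)) == n


print(survives_double(MAX_SAFE))   # True
print(survives_double(2**53))      # True
print(survives_double(2**53 + 1))  # False: rounds to 2**53
```

Past 2**53 two distinct integers can map to the same double, so `float(2**53) == float(2**53 + 1)` holds.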

    4
  • Nagle's algorithm - Wikipedia
  • The only thing that TCP_NODELAY does is disable packet batching/merging through Nagle's algorithm. Supposedly that batching increases throughput by reducing the volume of redundant information required to send small data payloads in individual packets, with the tradeoff of higher latency. It's a tradeoff between latency and throughput. I don't see any reason for transfer rates to lower; quite the opposite. In fact the very few benchmarks I saw showed exactly that: TCP_NODELAY causing a drop in the transfer rate.

    There are also articles on the cargo cult behind TCP_NODELAY.

    But feel free to show your data.
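
For anyone who wants to run their own comparison, this is roughly how the option gets toggled, sketched in Python with the standard `socket` module. `make_nodelay_socket` is a name made up for this example; the socket is never connected, so this only shows setting and reading back the flag, not any measurement.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 turns off Nagle's batching of small writes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back; a non-zero value means the flag is set.
enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
print(enabled)
```

A fair benchmark would send the same workload with the option on and off and compare both latency and transfer rate.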

    1
  • Safe C++
  • It’s very hard for “Safe C++” to exist when integer overflow is UB.

    You could simply state you did not read the article and decided to comment out of ignorance.

    If you spent one minute skimming through the article, you would have stumbled upon the section on undefined behavior. Instead, you opted to post ignorant drivel.

    0
  • Safe C++
  • I wouldn’t call bad readability a loaded gun really.

    Bad readability is a problem caused by the developer, not the language. Anyone can crank out unreadable symbol soup in any language, if that's what they want/can deliver.

    Blaming the programming language for the programmer's incompetence is very telling, so telling there's even a saying: A bad workman always blames his tools.

    4
  • Safe C++
  • Well, auto looks just like var in that regard.

    It really isn't. Neither in C# nor in Java. They are just syntactic sugar to avoid redundant type specifications. I mean things like Foo foo = new Foo();. Who gets confused with that?

    Why do you think IDEs are able to tell which type a variable is?

    C# even goes a step further and allows developers to omit the type from the constructor call with its target-typed new expressions. No one is whining about dynamic types just because the language lets you instantiate an object with Foo foo = new();.

    8
  • 7
    datatracker.ietf.org RFC 7493: The I-JSON Message Format

    I-JSON (short for "Internet JSON") is a restricted profile of JSON designed to maximize interoperability and increase confidence that software can process it successfully with predictable results.

    17
    0
    Why Copilot is Making Programmers Worse at Programming
  • I think I could have stated my opinion better. I think LLMs' total value remains to be seen. They allow totally incompetent developers to occasionally pass as below average developers.

    This is a baseless assertion from your end, and a purely personal one.

    My anecdotal evidence is that the best software engineers I know use these tools extensively to get rid of churn and drudge work, and they apply it anywhere and everywhere they can.

    1
  • Conflict Resolution: Using Last-Write-Wins vs. CRDTs (2018)

    dzone.com Conflict Resolution: Using Last-Write-Wins vs. CRDTs - DZone

    Learn about two common techniques for resolving conflicts in your database: last-write-wins (LWW) and conflict-free replicated data types (CRDTs).

    0
    Why Copilot is Making Programmers Worse at Programming
  • They existed before LLMs were spitting code like today, and this will undoubtedly lower the bar for bad developers to enter.

    If LLMs allow bad programmers to deliver work with good enough quality to pass themselves off as good programmers, this means LLMs are fantastic value for money.

    Also worth noting: programmers do learn by analysing the output of LLMs, just as the programmers of old learned by reading someone else's code.

    2
  • olano.dev My Software Bookshelf

    The easiest way to make a to-read pile grow is to read a book from it.

    0
    github.com Release 5.0.0 · expressjs/express

    What's Changed 4.19.2 Staging by @wesleytodd in #5561 remove duplicate location test for data uri by @wesleytodd in #5562 feat: document beta releases expectations by @marco-ippolito in #5565 Cut ...

    2
    blog.cloudflare.com When Bloom filters don't bloom

    Last month finally I had an opportunity to use Bloom filters. I became fascinated with the promise of this data structure, but I quickly realized it had some drawbacks.

    0