Re: Introduction - Sam Lewis

From: Tim Düsterhus
Date: Sun, 09 Feb 2025 13:45:00 +0000
Subject: Re: Introduction - Sam Lewis
Groups: php.internals
Hi

On 2/8/25 20:52, Eugene Sidelnyk wrote:
> Maybe there could be another good feature to start with - the function to format bytes into a human-readable format (for debug purposes) to have pretty view of the size (like 1.5GB, or 20MB) […] This is too little thing for having separate composer library, and perhaps having it built-in would be better?

On a surface level such a function would appear to be very simple, but the “human-readable format” bit alone already raises multiple questions, with the most notable one being: What is human-readable?

There's different languages in the world and they all use different decimal separators and group digits differently. And they possibly have different rules with regard to whether or not a space is required between the scalar part and the unit.

In fact your example already is *incorrect* English, because English requires a space between the scalar part and the unit. And it should ideally be a non-breaking space. So in English it would be:

- 1.5 GB
- 20 MB

In German we use the comma as the decimal separator (and also a space before the unit), so it would need to be:

- 1,5 GB
- 20 MB

Then there's also the question of whether to use the binary scale or the decimal scale. In other words: Should 1460 Bytes be rendered as 1.5 kB or as 1.4 KiB?

Also: How many decimal digits should be printed? Should it even be a fixed number of decimal digits, or are we rather interested in a total number of significant digits? In more explicit terms:

For 2 *decimal* digits:

1234 Bytes -> 1.23 kB
12345 Bytes -> 12.35 kB
123456 Bytes -> 123.46 kB

For 3 *significant* digits:

1234 Bytes -> 1.23 kB
12345 Bytes -> 12.3 kB
123456 Bytes -> 123 kB

Then there's the question of when the next unit should be used. Should the cut-off point be “there needs to be a 1 in front of the decimal point”? In some cases printing 0.9 GB might be preferable to 900 MB.

I could probably go on and find further questions, but I believe this already showcases how the devil is in the details - as with many RFCs that look great on a surface level.

In this specific case of usefully formatting numbers, I can recommend taking a look at the NumberFormatter class of ext/intl (https://www.php.net/manual/en/class.numberformatter.php). The API is not particularly pretty, but it handles the complex details of “correctly formatting a float according to language rules”. AFAICT you would still need to append the correct unit (and divide the number of bytes) yourself, though.

Best regards
Tim Düsterhus
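
To illustrate the last point, here is a minimal sketch of combining NumberFormatter with a manual division and unit suffix. It assumes ext/intl is loaded. The function name formatBytes, its parameters and the unit lists are hypothetical and only serve the illustration; they are not part of PHP or ext/intl. The sketch hard-codes one answer to each of the questions above (fixed number of fraction digits, next unit as soon as the value reaches the base, non-breaking space before the unit), so it is one possible set of choices rather than a definitive implementation.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of PHP or ext/intl): formats a non-negative
 * byte count using the locale's number rules and appends the unit manually.
 */
function formatBytes(
    int $bytes,
    string $locale = 'en_US',
    bool $binary = false,     // assumption: the caller picks the scale
    int $fractionDigits = 1   // assumption: fixed maximum of fraction digits
): string {
    $base  = $binary ? 1024 : 1000;
    $units = $binary
        ? ['B', 'KiB', 'MiB', 'GiB', 'TiB']
        : ['B', 'kB', 'MB', 'GB', 'TB'];

    // Divide until the value drops below the base or the units run out.
    $value = (float) $bytes;
    $index = 0;
    while ($value >= $base && $index < count($units) - 1) {
        $value /= $base;
        $index++;
    }

    // NumberFormatter takes care of the decimal separator and digit grouping.
    $formatter = new NumberFormatter($locale, NumberFormatter::DECIMAL);
    $formatter->setAttribute(
        NumberFormatter::MAX_FRACTION_DIGITS,
        $index === 0 ? 0 : $fractionDigits // plain bytes have no fraction
    );

    // U+00A0 is the non-breaking space between the number and the unit.
    return $formatter->format($value) . "\u{00A0}" . $units[$index];
}

echo formatBytes(1460), "\n";                // 1.5 kB
echo formatBytes(1460, 'de_DE'), "\n";       // 1,5 kB
echo formatBytes(1460, 'en_US', true), "\n"; // 1.4 KiB
```

Note that the cut-off question remains open even in this small sketch: a value such as 999950 bytes is divided once to 999.95 and then rounds up during formatting, so it is printed as 1,000 kB rather than 1 MB.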
