Skip to content

Check existing parameter naming for consistency, clarity and possible brevity #16

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bwoebi opened this issue Jul 24, 2020 · 21 comments
Closed
Labels
help wanted Extra attention is needed

Comments

@bwoebi
Copy link
Member

bwoebi commented Jul 24, 2020

General issue to track naming suggestions:

Let me start with string functions...

General name conventions:

context current name suggestion reason
* $string $str $str is a very common abbreviation, and effectively just repeating the typehint. Writing $string over $str as parameter name has no intrinsic clarity gain. Also e.g. strtr() uses $str, while str_split uses $string.
string functions $data $str $data suggests, without reading the source signature, that something may be not just a string.
* $is_ prefix $ Drop is_ prefixes for booleans, they are anyway given by the type; is_something: true in a named function call does not help over something: true
comparing functions $frist, $second $str1, $str2 E.g. similar_text, keep it consistent with strcmp(). (for non-string types, obviously not a "str" prefix)
function current parameters suggestion reason
str_i?replace $search, $replace, $subject, $count $from, $to, $subject is $search what shall be serached for or in? Also mirroing strtr() then
strc?spn $subject, $mask, $start, $length $str, $mask, $start, $length $subject does not provide any clarity over generic short $str
addcslashes $str, $charlist $str, $mask consistency with $mask from strspn
chunk_split $body, $chunklen, $end $str, $chunklen, $end The notion of $body is derived from the usage in mails, but the function is far more generic than that
[lr]?trim $str, $character_mask $str, $mask consistency with $mask from strspn
(sha1|md5)(_file) $filename/$str, $raw_output $filename/$str, $binary what is "raw" even supposed to mean? We want binary, i.e. not a hex representation
str_word_count $string, $format, $charlist $str, $format, $mask consistency with $mask from strspn
strpbrk $haystack, $char_list $haystack, $mask consistency with $mask from strspn
substr_compare $main_str, $str, $offset, $length, $case_insensitivity ??, ??, $offset, $length, $case_insensitivity Not sure, but $main_str and $str is weird - maybe needle and haystack?
@nikic
Copy link
Member

nikic commented Jul 24, 2020

See also https://gist.github.com/nikic/b1223525b8bb0a1bef0f6b8a267318e9 for a list of currently used parameter names, as of a week ago.

@beberlei
Copy link

beberlei commented Jul 25, 2020

Question: DOMDocument uses $filname a lot, the whole API is mostly camel case, so question would be $fileName or $filename?

Edit: Mark made a good suggestion to rename these in dom explicitly to $path

@derickr
Copy link
Member

derickr commented Jul 25, 2020 via email

@derickr
Copy link
Member

derickr commented Jul 26, 2020

I've given ext/date another pass. Should be done now.

In general, I'm not a fan of short names like $str over $string, and I am a fan of clarify over conciseness. Try also pronouncing $str.

@bwoebi
Copy link
Member Author

bwoebi commented Jul 26, 2020

@derickr For the specific instance of str vs string I was also taking inspiration of other languages, e.g. strcmp in C has str1 and str2. In mysql the string to be replaced is called str in replace function. In D the parameters are even called just S (e.g. split()). etc. I see an overwhelmingly lot of abbreviations for string in the wild, most commonly str. I'd argue we really are in line with other languages if we always use $str. I agree with you that unnecessary abbreviating should be avoided, but when a specific abbreviation has a lot of precedent, I see no gain in writing it out.

@kocsismate
Copy link
Member

FWIW I like most of the suggestions in the OP, except for $str. :) I think parameter names are not something that needs to be in sync with other languages, so that we can set our own policy.

However, Nikita also expressed that he'd prefer abbreviations where it's possible, so I think we should come up with some heuristics when to consider to use them:

  • when an abbreviations is well-estabished (This was already mentioned in many comments)
  • when an abbreviation is significantly shorter than the non-abbreviated form

While $str definitely falls in the first category, it does not in the second (only 3 characters shorter). In the same time, e.g. $char, $arg, and probably $param (5, 5, 4 characters shorter) fall in both in my opinion.

Even if we agreed to use the above rule of thumbs, we'd be using abbreviations pretty much inconsistently, because parameter names would be mostly non-abbreviated, with some "random" exceptions. In order to fix this inconsistency, we could have another heuristics:

  • we should use an abbreviation when it is also used in written text

For example, it's not likely that someone writes This function returns a str, while it's more likely to see sentences like The first param is optional, The first arg passed to the function is an integer. This way, I think we could alleviate the inconsistency of mixing abbreviations with full names.

@TysonAndre
Copy link

https://gist.github.com/TysonAndre/8d71559febf58f6af41cce009d4fc704#file-gistfile1-txt-L15 has some notes on the differences between php-src and Phan, which uses function signatures from a variety of places

  • ignore str vs string, which I forgot about and renamed too early.

There's a collection of scripts I wrote in https://github.com/phan/phan/tree/master/internal (internalsignatures.php and lib/) which were originally designed for extracting union types, but can be adapted for extracting parameter names. They're not exactly designed for reuse, but it's still doable (extract all from stubs file, extract all from svn, and compare where they differ in naming).

  • This expects to be run with php 8.0-dev
  • Some bug fixes I found aren't in a stable release yet
  • Instead of comparing with phan's names from a variety of sources, it would be more useful to compare the svn docs repo and *.stub.php contents to get those consistent
  • Seeing the names phan extracted over a long period of time from a variety of sources such as proto comments may be helpful in considering alternate names that could be used for public APIs
  • internal/lib/IncompatibleSignatureDetectorBase.php generated that gist

This also found some bugs.

Name mismatch for imagesetclip: #3 is $y1 in Phan, $x2 in source
Name mismatch for imagesetclip: #4 is $x2 in Phan, $y1 in source
	zend_long x1, y1, x2, y2;

	if (zend_parse_parameters(ZEND_NUM_ARGS(), "Ollll", &im_zval, gd_image_ce, &x1, &y1, &x2, &y2) == FAILURE) {
		RETURN_THROWS();
	}

  • How should resource handles be named? $handle? $stream? acronyms specific to the library (e.g. $ftp)?
  • Breaking this down per module (docs changes and arginfo changes) would probably be easiest to review

@Crell
Copy link

Crell commented Aug 21, 2020

Oh, this is a fun thread. Related: php/php-src#6027

In chat, Levi suggested standardizing array functions on "input" and "callback" (where it's currently a combination of array, array1, or various less sensible things). That seems sensible to me if it does to everyone else.

@Crell
Copy link

Crell commented Aug 21, 2020

Another related PR: php/php-src#6032

@TysonAndre
Copy link

$input seems reasonable - it'd hopefully make it easier to remember the names if it was used everywhere where there was one input array.

It also works better when there's also collection support, or for array_* functions that also accept non-traversable objects.


  • input seems slightly clearer than $arg or $value, but not quite as clear as $array when learning the language or seeing error messages (e.g. array_filter(): Argument #1 ($array) must be of type array, Closure given for positional parameters)
    At least it'd be better than $array for count and easier to remember if all the most common array methods used the same thing
  • What about names of sort(), array_shift(), etc - those only work with arrays and the reference parameter is both an input and an output. php.net has $array for sort
  • https://www.php.net/manual/en/function.iterator-apply.php uses $iterator - should that and others also change to input?
  • $collection or $iterable is also possible, but seems worse than $array or $input.

Maybe we should focus on adding positional-only parameters to php 8.0 to avoid this if we can't decide (I'm kidding) -- I doubt the release managers would give that an extension, and that addition may break extensions such as debuggers until they implemented support

@Crell
Copy link

Crell commented Aug 21, 2020

For most of the array functions, they take only an array and don't modify it, so $input seems reasonable to me. (I also considered $source, but I like $input better.) For the moment I'm not dealing with iterables; I don't expect array_map() to ever start taking an iterable, honestly.

For mutating functions like sort() et al, $array is probably the best option. Anything else gets kind of abstract, like $subject. Absent any better ideas I'm good leaving those as is.

@nikic
Copy link
Member

nikic commented Aug 22, 2020

I prefer $array over $input. "Input" is very generic and could refer to literally any parameter.

@TysonAndre
Copy link

I prefer $array over $input. "Input" is very generic and could refer to literally any parameter.

After some consideration, I feel the same way

There's also the inconsistency of having generic object functions accept $object, some generic string functions accept $str/$string, feof accepting handle/resource, but array functions expecting $input.

It'd be easier to see wrongness visually with array_filter(array: $x->getTraversable(), callback: $cb) than array_filter(input: $x->getTraversable(), callback: $cb), or more obscure functions in other extensions

@TysonAndre
Copy link

TysonAndre commented Aug 25, 2020

I've put together a script to compare the stubs used by php-src 8.0 with the parameter names in php.net

https://gist.github.com/TysonAndre/1994cc681ad04109b7ab3f2fd7407b6f
The generated output from https://github.com/phan/phan/commits/1c7bb0b5ea3fa46460806c33895a5a509af187b3 (excludes some files with repeated basenames such as calendar.stub.php, easy enough to rename)

There's around 800 mismatches. Some notable differences:

# Run with php8.0 internal/internalsignatures.php compare-named-parameters ../php-src-stubs ../phpdoc-en
# from https://github.com/phan/phan/commits/1c7bb0b5ea3fa46460806c33895a5a509af187b3

Parsed 5173 signatures from stubs and 9704 from documentation
Comparing signatures
Saw parameter name mismatch for ArrayIterator::asort
Reflection parameters: {"sort_flags=":"int"}
php.net documentation: []

Saw parameter name mismatch for ArrayIterator::ksort
Reflection parameters: {"sort_flags=":"int"}
php.net documentation: []

Saw parameter name mismatch for ArrayIterator::offsetSet
Reflection parameters: {"index":"mixed","value":"mixed"}
php.net documentation: {"index":"mixed","newval":"mixed"}

Saw parameter name mismatch for ArrayObject::asort
Reflection parameters: {"sort_flags=":"int"}
php.net documentation: []

....

Saw parameter name mismatch for curl_strerror
Reflection parameters: {"error_number":"int"}
php.net documentation: {"errornum":"int"}

Saw parameter name mismatch for curl_unescape
Reflection parameters: {"handle":"\\CurlHandle","string":"string"}
php.net documentation: {"ch":"resource","str":"string"}

Saw parameter name mismatch for current
Reflection parameters: {"arg":"array|object"}
php.net documentation: {"array":"array"}

....
Saw parameter name mismatch for usort
Reflection parameters: {"&array":"array","cmp_function":"callable"}
php.net documentation: {"&array":"array","value_compare_func":"callable"}

Saw parameter name mismatch for var_dump
Reflection parameters: {"value":"mixed","...values=":"list<mixed>"}
php.net documentation: {"expression":"mixed","...=":"mixed"}

Saw parameter name mismatch for var_export
Reflection parameters: {"value":"mixed","return=":"bool"}
php.net documentation: {"expression":"mixed","return=":"bool"}
...

@Crell
Copy link

Crell commented Aug 27, 2020

New PR that covers all of array functions: php/php-src#6046

Docs to be updated accordingly after that gets merged.

@dtakken
Copy link

dtakken commented Sep 18, 2020

I was just wondering what to do with extensions that expose a dual object oriented / procedural API. For these extensions the class methods have equivalent global functions.

We use different parameter naming styles for functions and methods. However, it seems like the dual APIs share the signatures between a method and its equivalent function. That means we presently cannot use different parameter naming styles for the two APIs without changing something about the method/function mapping mechanism.

Example: https://github.com/php/php-src/blob/b4c2670f857bd23c8b0922bd7f0f4a25d91f28d2/ext/xmlwriter/php_xmlwriter.stub.php

Solving this requires changes to the stub parser, right?

@kocsismate
Copy link
Member

@dtakken I think there's no technical issue with using different parameter names in aliases. When I was making changes to this file today (I assume you saw it? :) ) I was close to rename the parameters in question to use the correct case, but I wanted to have some discussion first, so changed my mind.

@dtakken
Copy link

dtakken commented Sep 18, 2020

@kocsismate Ok then I do not understand how it works. :) I only see it generate one function signature in the *_arginfo.h file. I do see that an @alias comment can change the name of the resulting function, but not its parameters?

Yes, I saw your changes. Great job.

@kocsismate
Copy link
Member

kocsismate commented Sep 18, 2020

I only see it generate one function signature in the *_arginfo.h file.

The alias in the context of stubs only means that the same C code (PHP_FUNCTION(xmlwriter_write_dtd_entity)) is to be invoked for both the aliased function and the alias method. On the other hand, both of them have a separate arginfo structure, since the return type of these functions and the number of parameters are already different (https://github.com/php/php-src/blob/d9628b9ca961d634e81e5c4d4fcb218873ae9159/ext/xmlwriter/php_xmlwriter_arginfo.h#L158 and https://github.com/php/php-src/blob/d9628b9ca961d634e81e5c4d4fcb218873ae9159/ext/xmlwriter/php_xmlwriter_arginfo.h#L313).

@dtakken
Copy link

dtakken commented Sep 18, 2020

@kocsismate Ah yes, I see it now. I was looking into it to see how I might help a bit with this effort. Now I know. Thanks!

nicolas-grekas added a commit to symfony/polyfill that referenced this issue Oct 23, 2020
This PR was merged into the 1.19-dev branch.

Discussion
----------

Fix the name of arguments for PHP 8

With PHP 8 coming, we need to have function/methods parameters names consistent between polyfills and the original implementations.

I'm not sure if this is all of it yet.

see:
php/php-tasks#23
php/php-tasks#16

Commits
-------

f75e14e Fix the name of arguments for PHP 8
@nikic
Copy link
Member

nikic commented Dec 3, 2020

With PHP 8 released, I think it's safe to close this...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

8 participants