scripts/pythondistdeps: Notes from an attempted rewrite to importlib.metadata
Notes from an attempted rewrite from pkg_resources to importlib.metadata in 2020:
1. While pkg_resources can open metadata at a specified path
(Distribution.from_location()), importlib provides access only to
"installed package metadata", i.e. the dist-info or egg-info directory
must be "discoverable", i.e. on the sys.path.
- Thankfully only the dist/egg-info directory must exist; the
corresponding Python module does not have to be present.
- The problems this causes:
(a) You have to manipulate the sys.path to add the specific location of
the site-packages directory inside the buildroot
(b) If you have package "foo" in this newly added directory on sys.path
and there is some problem and its dist/egg-info metadata are not found,
importlib.metadata continues searching the sys.path and may discover a
package with the same name (possibly same version) outside the
buildroot.
To get around this, you can manipulate the sys.path to remove all
other "site-packages" directories. But you have to leave the
standard library there, because importlib may import other modules
(in my testing: base64, quopri, random, socket, calendar, uu)
(c) I have not tested how well it works if you're inspecting metadata of
a different Python version than the one you run the script with
(especially Python 2 vs Python 3). This might also cause problems with
dependency specifiers (e.g. python_version != "3.4")
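The sys.path manipulation described in (a) and (b) could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names buildroot_sys_path and distribution_in_buildroot are hypothetical, Python 3.8+ is assumed, and the filter simply drops every sys.path entry whose path contains "site-packages" while leaving the standard library entries in place.

```python
import contextlib
import importlib.metadata
import sys


@contextlib.contextmanager
def buildroot_sys_path(site_packages):
    """Temporarily search only `site_packages` plus the non-site-packages entries."""
    saved = sys.path[:]
    # Keep the standard library entries, drop every other site-packages directory.
    kept = [p for p in saved if "site-packages" not in str(p)]
    sys.path[:] = [str(site_packages)] + kept
    try:
        yield
    finally:
        # Always restore the original search path, even on errors.
        sys.path[:] = saved


def distribution_in_buildroot(name, site_packages):
    """Look up `name` without falling back to other site-packages directories.

    Raises importlib.metadata.PackageNotFoundError if the metadata is missing.
    """
    with buildroot_sys_path(site_packages):
        return importlib.metadata.distribution(name)
```

With the other site-packages directories removed, a missing dist/egg-info directory in the buildroot surfaces as PackageNotFoundError rather than silently matching a same-named package installed elsewhere.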
2. Handling of dependencies (requires) is problematic in importlib.metadata
- pkg_resources provides a way to separately list the standard requires and
the requires for each "extras" category. importlib does not provide this; it
only spits out a list of strings, each string in the format:
- 'packaging>=14',
- 'towncrier>=18.5.0; extra == "docs"', or
- 'psutil<6,>=5.6.1; (python_version != "3.4") and extra == "testing"'
You can either parse these with a regex (fragile) or use the external
`packaging` Python module. `packaging`, however, also doesn't have great
support for figuring out extra dependencies; it provides the Marker API:
- <Marker('python_version != "3.4" and extra == "testing"')>
You can use the Marker API to evaluate the condition, but not to parse it.
For parsing you can access the private API Marker._markers:
- marker._markers=[[(<Variable('python_version')>, <Op('!=')>, \
<Value('3.4')>)], 'and', (<Variable('extra')>, <Op('==')>, \
<Value('testing')>)]
which beyond the problem of being private is also not very useful for
parsing due to its structure.
- pkg_resources also provides version parsing, which importlib does not,
so `packaging` needs to be used
- importlib is part of the standard library, but packaging and its
2 runtime dependencies (pyparsing and six) are not, and therefore we
would go from 1 dependency to 3
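For illustration, the regex route mentioned above might look like the following sketch, which uses only the standard library. It is fragile, as noted: the helper name split_requires is hypothetical, it assumes the extra is always written as extra == "name", and it discards every other marker condition (such as python_version) instead of evaluating it.

```python
import re
from collections import defaultdict

# Matches the `extra == "name"` (or 'name') part of an environment marker.
_EXTRA_RE = re.compile(r"""extra\s*==\s*(['"])(?P<extra>[^'"]+)\1""")


def split_requires(requires):
    """Group requirement strings by extra; the key None holds the base requires.

    `requires` is the list returned by dist.requires (which may be None).
    Only the part before ';' is kept, so other marker conditions are lost.
    """
    grouped = defaultdict(list)
    for req in requires or []:
        spec, _, marker = req.partition(";")
        match = _EXTRA_RE.search(marker)
        extra = match.group("extra") if match else None
        grouped[extra].append(spec.strip())
    return dict(grouped)
```

Called on the three example strings above, this yields one group under None and one group each for "docs" and "testing".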
3. A few minor issues, more in the next section about equivalents.
importlib.metadata.distribution equivalents of pkg_resources.Distribution attributes:
- pkg_resources: dist.py_version
importlib: # not implemented (but can be guessed from the /usr/lib/pythonXX.YY/ path)
- pkg_resources: dist.project_name
importlib: dist.metadata['name']
- pkg_resources: dist.key
importlib: # not implemented
- pkg_resources: dist.version
importlib: dist.version
- pkg_resources: dist.requires()
importlib: dist.requires # but returns strings with almost no parsing done, and also lists extras
- pkg_resources: dist.requires(extras=dist.extras)
importlib: # not implemented, has to be parsed from dist.requires
- pkg_resources: dist.get_entry_map('console_scripts')
importlib: [ep for ep in importlib.metadata.entry_points()['console_scripts'] if ep.name == pkg][0]
# I have not found a better way to get the console_scripts
- pkg_resources: dist.get_entry_map('gui_scripts')
importlib: # Presumably same as console_scripts, but untested
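One possible alternative for the console_scripts and gui_scripts lookups, not taken from the original attempt: a Distribution object exposes its own entry_points, so filtering those by group avoids scanning the global mapping. The helper name scripts_of is hypothetical, and this sketch assumes Python 3.8+.

```python
import importlib.metadata


def scripts_of(dist, group="console_scripts"):
    """Return {name: value} for the entry points of `dist` in `group`."""
    # dist.entry_points lists only this distribution's entry points;
    # each one carries .name, .value and .group attributes.
    return {ep.name: ep.value for ep in dist.entry_points if ep.group == group}
```

For example, scripts_of(importlib.metadata.distribution(pkg)) would cover the console_scripts case, and passing group="gui_scripts" the other one.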