Fwd: Installation hell

Eryk Sun eryksun at gmail.com
Tue Dec 20 08:11:00 EST 2022


On 12/19/22, Chris Angelico <rosuav at gmail.com> wrote:
> On Tue, 20 Dec 2022 at 09:12, Thomas Passin <list1 at tompassin.net> wrote:
>
>> @echo off
>> setlocal
>> : Find effective drive for this file.
>> set ed=%~d0
>> path %ed%\python37\Scripts;%ed%\python37;%PATH%

For reference, in case not everyone on the list knows what "%~d0"
means, the CMD shell supports extracting the drive (d), path (p), name
(n), and extension (x) components of a path name that's stored in a
parameter such as "%0". The full path (f) is resolved beforehand. For
example:

    C:\Temp>set var=spam\eggs.py

    C:\Temp>for %c in (%var%) do @echo drive: "%~dc"
    drive: "C:"

    C:\Temp>for %c in (%var%) do @echo path: "%~pc"
    path: "\Temp\spam\"

    C:\Temp>for %c in (%var%) do @echo name: "%~nc"
    name: "eggs"

    C:\Temp>for %c in (%var%) do @echo extension: "%~xc"
    extension: ".py"

    C:\Temp>for %c in (%var%) do @echo full path: "%~dpnxc"
    full path: "C:\Temp\spam\eggs.py"

    C:\Temp>for %c in (%var%) do @echo full path: "%~fc"
    full path: "C:\Temp\spam\eggs.py"
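
The same components can be sketched in Python with the standard ntpath
module, which works on any platform. This is only a rough analogue of
what CMD does. The helper name cmd_modifiers() and its cwd argument
are made up for illustration:

```python
import ntpath

def cmd_modifiers(cwd, value):
    """Rough analogue of CMD's %~d, %~p, %~n, %~x and %~f modifiers."""
    # CMD resolves the full path first, relative to the working directory.
    full = ntpath.normpath(ntpath.join(cwd, value))
    drive, rest = ntpath.splitdrive(full)
    head, tail = ntpath.split(rest)
    name, ext = ntpath.splitext(tail)
    # %~p ends with a backslash, so add one unless head is already the root.
    if not head.endswith("\\"):
        head += "\\"
    return {"d": drive, "p": head, "n": name, "x": ext, "f": full}

parts = cmd_modifiers(r"C:\Temp", r"spam\eggs.py")
for key in "dpnxf":
    print(key, repr(parts[key]))
```

The working directory is passed in explicitly so that the example
doesn't depend on the process state the way the CMD session does.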

> So much easier to do on a Unix-like system, where you don't need to
> concern yourself with "effective drive" and can simply use relative
> paths.

A relative path in the PATH environment variable would depend on the
current working directory. Surely the added paths need to be absolute.
However, Thomas didn't have to reference the drive explicitly. The
expression "%~dp0" is the fully-qualified directory of the executing
batch script, and an absolute path can reference its ancestor
directories using ".." components.
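
As a small illustration of that last point, here is what such a path
refers to once the ".." components are resolved. The directory layout
and the helper name relative_to_script_dir() are hypothetical, and
ntpath is used so that the sketch runs on any platform:

```python
import ntpath

def relative_to_script_dir(script_dir, *components):
    """Join components onto the script directory and resolve ".."."""
    joined = ntpath.join(script_dir, *components)
    return ntpath.normpath(joined)

# Hypothetical value of %~dp0. It always ends with a backslash.
script_dir = "E:\\tools\\bin\\"

print(relative_to_script_dir(script_dir, "..", "python37"))
print(relative_to_script_dir(script_dir, "..", "python37", "Scripts"))
```

The results are absolute, so they don't depend on the current working
directory, and the drive never has to be named explicitly.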

> I know we're not here to bash Windows, but... drive letters
> really need to just die already.

I don't foresee drive-letter names getting phased out of Windows. And
Windows itself is unlikely to get phased out as long as Microsoft
continues to profit from it, as it has for the past 37 years.

The drive concept is deeply ingrained in the design of NT, the Windows
API, shells, and applications. While assigning drive names "A:", "B:",
and "D:" to "Z:" can be avoided, the system volume, i.e. drive "C:",
still has to be accessed in the normal way, or using another one of
its persistent names, such as r"\\?\BootPartition".

The latter still uses the filesystem mount point on the root path of
the device (e.g. "\\\\?\\BootPartition\\"), which you probably take
issue with. That's a deeply ingrained aspect of Windows. Even mount
points set on filesystem directories are actually bind mount points
that ultimately resolve to the root path on the volume device (e.g.
"\\Device\\HarddiskVolume4\\").  This differs from how regular mount
points work on Unix, for which a path like "/dev/sda1/etc" is
gibberish.

Below I've outlined the underlying details of how logical drives (e.g.
"C:"), UNC shares (e.g. r"\\server\share"), other device names, and
filesystem mount points are implemented on NT.

---

NT Device Names

In contrast to Unix, NT is organized around an object namespace, not a
root filesystem. Instances of many object types can be named. Some
named object types also support a parse routine for paths in the
namespace of an object (e.g. the configuration manager's registry
"Key" type and the I/O manager's "Device" type).

The object manager uses two object types to define the object
namespace: Directory and Symbolic Link. Directory objects form the
hierarchical tree. At the base of the tree is the anonymous root
directory object (i.e. "\\"). A directory is implemented as a hashmap
of named objects. A directory can be set as the shadow of another
directory, creating a union directory for name lookups.

Unless otherwise stated, the following discussion uses "directory" and
"symlink" to refer to a directory object and a symbolic-link object,
respectively -- not to a filesystem directory or filesystem symlink.

A canonical NT device name (e.g. "C:", "PIPE", "UNC") is implemented
in the object namespace as a symlink that targets the path of a real
device object. The real device is typically in the r"\Device"
directory. A canonical device name might be a persistent name for an
enumerated device (e.g. "C:" -> r"\Device\HarddiskVolume2"). In some
cases the real device name is persistent, but it's different from the
canonical name (e.g. "PIPE" -> r"\Device\NamedPipe", or "UNC" ->
r"\Device\Mup").

The symlink that implements a canonical device name is created either
in the r"\Global??" directory or in a directory that's used for local
device names in a given logon session (e.g.
r"\Sessions\0\DosDevices\<logon session id>"). The global device
directory generally contains system devices. Mapped drives and
substitute drives typically use a local device directory, so users
don't have to worry about conflicting drive assignments in these
cases.

The global device directory is the shadow of each local device
directory, forming a union for name lookups. If the same device name
is defined in both the local and global directories, the local device
name takes precedence. However, each local device directory also
contains a symlink named "Global" to explicitly reference the global
device directory.

In NT 5.1+, the object path r"\??" is implemented as a virtual
directory. It always references the local device directory of the
current thread's logon session. For example, r"\??\Z:" is probably a
mapped drive in the current logon session, while r"\??\C:" is the
global "C:" drive, which can be written more explicitly as
r"\??\Global\C:".

The target of a symlink can be any path, including a directory in a
filesystem. The latter is the case for mapped drives (e.g. "Z:" ->
r"\Device\Mup\server\share") and substitute drives (e.g. "W:" ->
r"\??\C:\Windows"). Note that drive names that target a filesystem
directory instead of a device are peculiar. API functions generally
special-case mapped drives, which target a path on r"\Device\Mup".
Substitute drives, on the other hand, may cause some API functions to
misbehave, such as GetVolumePathNameW().
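
The lookup rules above can be modeled with a toy in Python. This is
not how the object manager is implemented; it's a sketch that assumes
one local device directory, uses the example targets from this
message, and invents the names GLOBAL_DEVICES, LOCAL_DEVICES and
resolve():

```python
from collections import ChainMap

GLOBAL_DEVICES = {
    "C:": r"\Device\HarddiskVolume2",
    "PIPE": r"\Device\NamedPipe",
    "UNC": r"\Device\Mup",
}
LOCAL_DEVICES = {
    "Z:": r"\Device\Mup\server\share",  # mapped drive
    "W:": r"\??\C:\Windows",            # substitute drive
}
# Local names are searched first, then the shadowed global directory.
DOS_DEVICES = ChainMap(LOCAL_DEVICES, GLOBAL_DEVICES)

PREFIX = "\\??\\"

def resolve(nt_path):
    """Replace the leading device symlink with its target, repeatedly."""
    while nt_path.startswith(PREFIX):
        name, sep, rest = nt_path[len(PREFIX):].partition("\\")
        directory = DOS_DEVICES
        if name == "Global":
            # The "Global" symlink bypasses the local directory.
            directory = GLOBAL_DEVICES
            name, sep, rest = rest.partition("\\")
        nt_path = directory[name] + sep + rest
    return nt_path

print(resolve(r"\??\Z:\data"))
print(resolve(r"\??\W:\System32"))
print(resolve(r"\??\Global\C:\Windows"))
```

The substitute drive takes two passes through the loop because its
target is itself a path in the virtual directory.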

The Windows API reserves two UNC path prefixes that translate to the
"\\??\\" virtual directory path in NT. The prefix "\\\\?\\"
(backslashes only) begins a path that opens literally, and the prefix
"\\\\.\\" (any mix of forward slashes and backslashes) begins a path
that opens normalized. The normalization of a path replaces forward
slashes with backslashes, collapses repeated slashes, resolves "." and
".." components, and strips trailing spaces and dots.

For example, "//./C:/spam/foo/..///eggs. . ." opens normalized as the
NT path r"\??\C:\spam\eggs", while r"\\?\C:\spam\eggs. . ." opens
literally as the NT path r"\??\C:\spam\eggs. . .". Literal paths have
to be used carefully, else one might create filenames such as "eggs. .
." that can't be accessed normally.
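
To make the difference concrete, here is a toy translation of the two
prefixes in Python. It implements only the rules listed above and
makes simplifying assumptions: trailing spaces and dots are stripped
from the final component only, and a trailing backslash is kept as is.
The function name to_nt_path() is invented for the example, and the
real system routine handles many more cases:

```python
def to_nt_path(win_path):
    """Toy translation of a device-prefixed Windows path to an NT path."""
    if win_path.startswith("\\\\?\\"):
        # Literal form: only the prefix changes.
        return "\\??\\" + win_path[4:]
    unified = win_path.replace("/", "\\")
    if not unified.startswith("\\\\.\\"):
        raise ValueError("expected a \\\\?\\ or \\\\.\\ prefix")
    parts = []
    for part in unified[4:].split("\\"):
        if part in ("", "."):
            continue  # repeated slash or "." component
        if part == "..":
            if len(parts) > 1:  # never remove the device name
                parts.pop()
            continue
        parts.append(part)
    trailing = "\\" if unified.endswith("\\") and parts else ""
    if len(parts) > 1 and not trailing:
        # Assumption: strip only the final component.
        stripped = parts[-1].rstrip(" .")
        if stripped:
            parts[-1] = stripped
        else:
            parts.pop()
    return "\\??\\" + "\\".join(parts) + trailing

print(to_nt_path("//./C:/spam/foo/..///eggs. . ."))
print(to_nt_path("\\\\?\\C:\\spam\\eggs. . ."))
```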

Finally, the global device directory contains a symlink named
"GlobalRoot" that targets the root directory in the object namespace.
It's not particularly useful in the NT API, which can directly access
the root directory. The Windows API, however, would otherwise require
defining a new device via DefineDosDeviceW() to reach such paths.
For example, the NT path r"\Device\ConDrv\Output" can be opened in the
Windows API as r"\\?\GlobalRoot\Device\ConDrv\Output".
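
The mapping is just a prefix, as in the following sketch. The helper
name nt_to_win32() is made up for illustration, and the function only
builds the string without opening anything:

```python
def nt_to_win32(nt_path):
    """Spell an absolute NT object path as a literal Windows API path."""
    if not nt_path.startswith("\\"):
        raise ValueError("expected an absolute NT path")
    # "GlobalRoot" targets the root of the object namespace.
    return "\\\\?\\GlobalRoot" + nt_path

print(nt_to_win32(r"\Device\ConDrv\Output"))
```

The "\\\\?\\" prefix is used so that the rest of the path is passed
through literally.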

---

NT Mount Points

The primary mount point for a filesystem is usually the root path of
the device that contains the filesystem (e.g. "\\??\\PIPE\\"). For a
volume device, it's called a volume mount point (e.g. "\\??\\C:\\").
This is automatic by default. As soon as a volume device comes online,
an existing filesystem on the device gets recognized and mounted on
the root path of the device.

A common exception is the r"\??\UNC" device, which virtually mounts
UNC shares such as r"\??\UNC\server\share". A UNC share is the root
directory of a redirected filesystem or virtual filesystem. When a new
share path is encountered, the "UNC" device (really r"\Device\Mup")
sends it simultaneously to a prioritized list of providers in order to
determine which providers handle it, if any.

A bind mount point (i.e. a filesystem junction) can be set on an empty
filesystem directory. The target must be a path on a local volume
device. This is basically like the result of the JOIN command in
MS-DOS, except it's generalized to allow grafting a subtree instead of
an entire volume, and the joined volume isn't required to have a
logical drive name.  A junction can also be used instead of a
substitute drive (e.g. a drive created by subst.exe), either of which
makes working with a long base path more convenient.

A junction that targets the root path of a volume name can be
registered with the mount-point manager as the volume's canonical
mount point. A volume name is based on the volume's persistent GUID.
For example, the target path will look like
"\\??\\Volume{12345678-1234-1234-1234-123456781234}\\".

UNC paths can't be mounted on a filesystem directory. However, a
filesystem symlink can target a UNC path. This can be used in place of
a mapped drive. That said, the canonical mount point for a UNC share
is always of the form r"\??\UNC\server\share". It's never a mapped
drive, and it's definitely never a filesystem symlink.

