In this article, I will explain how NetworkManager-1.40 handles hostname assignment.
Introduction
In networking, ensuring consistency and reliability across system updates is crucial. Recently, we at Red Hat’s Network Management Team encountered an issue related to hostname assignment during a customer’s upgrade from NetworkManager 1.18 to 1.40 using Leapp. After the upgrade, the hostname reverted to localhost.localdomain, which had an impact on the customer’s production environment. In this blog post, we’ll explore the root cause of this problem, its implications, and the solution we implemented to address it.
Background
NetworkManager plays a key role in managing network configurations on Linux systems. It provides a comprehensive set of features for configuring and managing network connections. Hostname assignment is a helpful function in network environments, particularly for administrative and management purposes, as it allows each device to be uniquely identified by a human-readable name. Getting a generic hostname like localhost.localdomain defeats this purpose, making logging, device management, and network troubleshooting more difficult.
Historical Context
NetworkManager has always used several mechanisms to obtain the system hostname: hostnames configured in settings, automatic hostnames provided by network sources such as DHCP or VPN configurations, and previously set hostnames retained from earlier settings. As a fallback, NetworkManager can perform a reverse DNS lookup of the IP address assigned to an interface and assign the resulting name to the local machine. This fallback ensures that even systems without explicit hostname settings can still have a meaningful hostname for administrative purposes.
In 1.18, NetworkManager relied on GLib to perform this reverse DNS lookup. GLib is a general-purpose, portable utility library that provides many useful data types, macros, type conversions, string utilities, file utilities, and a main loop abstraction, which NetworkManager leverages for efficient event handling, asynchronous operations, and managing network configurations and states. In particular, NetworkManager 1.18 used GLib’s g_resolver_lookup_by_address_async() function to perform the reverse DNS lookup. This call goes through the glibc resolver, which uses the NSS modules defined in /etc/nsswitch.conf, so IP addresses could be mapped to hostnames even when DHCP or DNS wasn’t configured. As a result, systems with static IP addresses could still have their hostnames resolved from entries in /etc/hosts.
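To make the /etc/hosts part of that lookup concrete, here is a minimal Python sketch of a reverse lookup over hosts-file text. It is not NetworkManager or glibc code. The function name reverse_lookup_in_hosts and the HOSTS_TEXT sample are made up for illustration, and the sample entries mirror the reproducer shown later in this post.

```python
import ipaddress

# Sample hosts-file content (illustrative; mirrors the reproducer in this post).
HOSTS_TEXT = """\
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.0.2.70 myhostname.example.com
"""


def reverse_lookup_in_hosts(hosts_text, address):
    """Return the first name listed for `address` in hosts-file text, or None."""
    wanted = ipaddress.ip_address(address)
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line or address without a name
        try:
            entry_address = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip lines that do not start with a valid address
        if entry_address == wanted:
            return fields[1]  # first name after the address
    return None


print(reverse_lookup_in_hosts(HOSTS_TEXT, "192.0.2.70"))  # myhostname.example.com
```

A lookup for an address that has no entry returns None, which is the case where a caller would move on to another source.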
However, this process ran into problems caused by some NSS modules on some distributions (including Fedora). Those modules have higher priority than dns and can return synthetic (locally generated) results. Such modules are:
- myhostname, which returns the currently configured hostname when looking up local addresses.
- resolve, which asks systemd-resolved and can also return non-DNS results. In particular, similarly to myhostname, it returns the current hostname for local addresses.
These locally generated results can be problematic because they override the expected hostname resolution from DNS or /etc/hosts, leading to inconsistencies. Refer to the NetworkManager.conf(5) manual for detailed configuration options.
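One way to see whether a system is affected is to check which modules precede dns on the hosts line of /etc/nsswitch.conf. The Python sketch below does this on a sample string. The SAMPLE_NSSWITCH content is an illustrative assumption rather than a copy of any particular distribution's file, and the helper names are hypothetical.

```python
import re

# Modules named in this post as returning synthetic results.
SYNTHESIZING_MODULES = {"myhostname", "resolve"}

# Illustrative nsswitch.conf content (an assumption, not a real system's file).
SAMPLE_NSSWITCH = """\
passwd: files systemd
hosts:  files myhostname resolve [!UNAVAIL=return] dns
"""


def hosts_modules(nsswitch_text):
    """Return the NSS modules on the 'hosts:' line, in lookup order."""
    for raw_line in nsswitch_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            spec = line[len("hosts:"):]
            # Remove action items such as [!UNAVAIL=return].
            spec = re.sub(r"\[[^\]]*\]", " ", spec)
            return spec.split()
    return []


def synthesizing_before_dns(modules):
    """List synthesizing modules that are consulted before 'dns'.

    If 'dns' is absent, every synthesizing module in the list is returned.
    """
    found = []
    for module in modules:
        if module == "dns":
            break
        if module in SYNTHESIZING_MODULES:
            found.append(module)
    return found


modules = hosts_modules(SAMPLE_NSSWITCH)
print(modules)                           # ['files', 'myhostname', 'resolve', 'dns']
print(synthesizing_before_dns(modules))  # ['myhostname', 'resolve']
```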
Given this problem, NM-1.40 made significant changes to NetworkManager’s hostname assignment mechanism. The new implementation does not read from /etc/hosts but instead relies on systemd-resolved for DNS resolution. The new function resolves an address via DNS in two stages. It first queries systemd-resolved with NO_SYNTHESIZE to avoid synthesized results. If that query fails, it spawns a separate helper process that configures glibc to use only the dns NSS module and then performs the resolution. While the goal of this change was to simplify and modernize the resolution process, it introduced an unexpected behavior change for systems that rely on /etc/hosts for hostname resolution.
The Problem
After upgrading from NM-1.18 to NM-1.40, a customer’s system hostname changed to localhost.localdomain. This issue emerged because the NIC was configured with a manual IP address, and the hostname was specified only in /etc/hosts. In version 1.40, NetworkManager did not check /etc/hosts to assign the hostname, unlike NM-1.18, where this configuration worked seamlessly.
Reproducing the Issue
We managed to reproduce the issue with the following configuration:
NIC Configuration
nmcli con add type ethernet ifname ethX con-name static-ip ip4 192.0.2.70/24
/etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.0.2.70 myhostname.example.com
/etc/hostname
localhost.localdomain
/etc/NetworkManager/NetworkManager.conf
[main]
dns=none
/etc/resolv.conf
# Generated by NetworkManager
With this configuration, NM-1.18 set the hostname to myhostname.example.com, whereas version 1.40 retained it as localhost.localdomain.
Root Cause Analysis
The root cause of the discrepancy was traced back to differences in how NM-1.18 and NM-1.40 handle hostname assignment. As mentioned before, version 1.18 used g_resolver_lookup_by_address_async(), a GLib function whose lookups go through the NSS modules and therefore consult /etc/hosts, which ensured hostname assignment for systems with static IP addresses. In NetworkManager 1.40, however, the helper resolved only via the dns module, so results from /etc/hosts were overlooked.
Solution
To restore the expected behavior, our team proposed and implemented the following changes:
- Updating the helper: Previously, the helper resolved only via the dns module. Now, it resolves via both dns and files. Additionally, if the systemd-resolved query fails, the helper is spawned to resolve using the files module. This ensures the right calls are made depending on the system’s hostname configuration.
- Enhancing consistency: We modified nm-daemon-helper to use both the dns and files NSS services, thereby ensuring that /etc/hosts is considered during hostname resolution.
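The effect of the helper change can be sketched with a small Python model. This is not the nm-daemon-helper source. The make_helper function, the in-memory backends dictionary, and the dns-then-files order are assumptions made for illustration only.

```python
def make_helper(services, backends):
    """Build a lookup function that tries each named service in order.

    `backends` maps a service name to an address-to-name dictionary that
    stands in for the real NSS module (hypothetical test data).
    """
    def lookup(address):
        for service in services:
            name = backends[service].get(address)
            if name:
                return name
        return None
    return lookup


# The reproducer's situation: no DNS record, but an /etc/hosts entry exists.
backends = {
    "dns": {},
    "files": {"192.0.2.70": "myhostname.example.com"},
}

old_helper = make_helper(["dns"], backends)           # behavior before the fix
new_helper = make_helper(["dns", "files"], backends)  # behavior after the fix

print(old_helper("192.0.2.70"))  # None
print(new_helper("192.0.2.70"))  # myhostname.example.com
```

With only dns available to it, the old helper finds nothing for the static address, which matches the hostname staying at localhost.localdomain in the reproducer.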
Detailed Implementation
The updated workflow is illustrated in the following flowchart:
The key steps include:
- Static Hostname Check: If a static hostname (one set manually and permanently in /etc/hostname) is set, it is used directly.
- Device List Evaluation: Build a sorted list of devices eligible for hostname evaluation.
- Hostname Resolution via DHCP/DNS: Check if the hostname can be obtained from DHCP or DNS.
- Fallback to /etc/hosts: If DHCP/DNS fails, attempt to resolve the hostname via /etc/hosts.
- systemd-resolved Check: Use systemd-resolved with NO_SYNTHESIZE to avoid synthetic results.
- Helper Invocation: If systemd-resolved is not available or fails, spawn a helper to check /etc/hosts.
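The steps above can be sketched as a single decision function. This is a simplified Python model, not NetworkManager's implementation. The device fields (priority, dhcp_hostname, address), the two lookup callables, and the treatment of placeholder names such as localhost.localdomain as "no static hostname" are all assumptions, the last one inferred from the reproducer's /etc/hostname.

```python
# Assumption: placeholder names are treated as if no static hostname were set,
# as the reproducer's /etc/hostname (localhost.localdomain) suggests.
PLACEHOLDER_HOSTNAMES = {"localhost", "localhost.localdomain"}


def choose_hostname(static_hostname, devices, resolved_lookup, helper_lookup):
    """Pick a hostname following the steps listed above, or return None.

    `resolved_lookup` models the systemd-resolved query without synthesized
    results; `helper_lookup` models the helper that also consults /etc/hosts.
    Both take an address and return a name or None.
    """
    # Static hostname check.
    if static_hostname and static_hostname not in PLACEHOLDER_HOSTNAMES:
        return static_hostname

    # Device list evaluation ("priority" is a hypothetical sort key).
    for device in sorted(devices, key=lambda d: d["priority"]):
        # Hostname provided by DHCP, if any.
        if device.get("dhcp_hostname"):
            return device["dhcp_hostname"]

        address = device.get("address")
        if address is None:
            continue

        # systemd-resolved check, then helper invocation as the fallback.
        name = resolved_lookup(address)
        if name is None:
            name = helper_lookup(address)
        if name:
            return name

    return None  # the caller keeps whatever default hostname is in place


hosts_entries = {"192.0.2.70": "myhostname.example.com"}
devices = [{"priority": 100, "address": "192.0.2.70"}]

result = choose_hostname(
    "localhost.localdomain",
    devices,
    resolved_lookup=lambda address: None,  # models a failed or unavailable query
    helper_lookup=hosts_entries.get,
)
print(result)  # myhostname.example.com
```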
The changes that make hostname resolution also consult /etc/hosts can be found in this merge request:
Conclusion
This issue highlighted the need for thorough testing and flexibility in handling hostname assignments across different system configurations. By incorporating feedback from users and conducting a detailed analysis, we were able to enhance NetworkManager’s hostname resolution process in NM-1.40 and later releases, ensuring it meets the high standards expected by our users.