Addressing hostname assignment in NetworkManager 1.40

A quick overview of NetworkManager 1.40 hostname assignment handling

In this article, I will explain how NetworkManager-1.40 handles hostname assignment.

Introduction

In networking, ensuring consistency and reliability across system updates is crucial. Recently, we at Red Hat’s Network Management Team encountered an issue related to hostname assignment during a customer’s upgrade from NetworkManager 1.18 to NetworkManager 1.40 using Leapp. The upgrade caused the hostname to revert to localhost.localdomain, which affected the customer’s production environment. In this blog post, we’ll explore the root cause of this problem, its implications, and the solution we implemented to address it.

Background

NetworkManager plays a key role in managing network configurations on Linux systems. It provides a comprehensive set of features for configuring and managing network connections. Hostname assignment is a helpful function in network environments, particularly for administrative and management purposes, as it allows each device to be uniquely identified by a human-readable name. Getting a generic hostname like localhost.localdomain defeats this purpose, making logging, device management, and network troubleshooting more difficult.

Historical Context

NetworkManager has always used several mechanisms to obtain the system hostname: a hostname configured in settings, an automatic hostname provided by a network source such as DHCP or a VPN configuration, and a previously set hostname retained from earlier settings. As a fallback, NetworkManager can perform a reverse DNS lookup of the IP address assigned to an interface and assign the resulting name to the local machine. This fallback ensures that even systems without explicit hostname settings can still have a meaningful hostname for administrative purposes.

In 1.18, NetworkManager relied on GLib to perform this reverse DNS lookup. GLib is a general-purpose, portable utility library that provides many useful data types, macros, type conversions, string utilities, file utilities, and a main loop abstraction, which NetworkManager leverages for efficient event handling, asynchronous operations, and managing network configurations and states. In particular, NetworkManager used GLib’s g_resolver_lookup_by_address_async() function to perform the reverse DNS lookup. This lookup goes through the glibc resolver, which uses the NSS modules defined in /etc/nsswitch.conf, so IP addresses could be mapped to hostnames even when DHCP or DNS wasn’t configured. As a result, systems with static IP addresses could still have their hostnames resolved from entries in /etc/hosts.
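
To make the NSS-backed lookup more concrete, here is a minimal Python sketch. It is not NetworkManager’s code, which is written in C and calls g_resolver_lookup_by_address_async(). It only assumes a glibc-based system, where Python’s socket.gethostbyaddr() goes through the same NSS modules that the hosts line of /etc/nsswitch.conf lists. The function name reverse_lookup is made up for this example.

```python
import socket
from typing import Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the primary name the system resolver maps `ip` to, or None."""
    try:
        # On glibc this consults the NSS modules from /etc/nsswitch.conf in order.
        name, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return name
```

Calling reverse_lookup("192.0.2.70") on a machine whose /etc/hosts maps that address to a name would be expected to return that name, provided files comes early enough on the hosts line.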

However, this process ran into problems caused by NSS modules that some distributions (including Fedora) enable. These modules have higher priority than dns and can return synthetic (locally generated) results. Such modules are:

  • myhostname, which returns the currently configured hostname when looking up local addresses.
  • resolve, which queries systemd-resolved and can therefore also return non-DNS results. In particular, similarly to myhostname, it returns the current hostname for local addresses.

These locally generated results can be problematic because they override the expected hostname resolution from DNS or /etc/hosts, leading to inconsistencies. Refer to the NetworkManager.conf(5) manual for detailed configuration options.
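
As a rough illustration of why module order matters, the following Python sketch parses the hosts line of an nsswitch.conf-style text and reports which of the two modules named above would be consulted before dns. The sample line is illustrative only and is not copied from any particular distribution. The function names are hypothetical.

```python
import re
from typing import List

# The two modules this article names as returning locally generated answers.
SYNTHESIZING_MODULES = {"myhostname", "resolve"}


def hosts_modules(nsswitch_text: str) -> List[str]:
    """Return the NSS modules on the `hosts:` line, in lookup order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line.startswith("hosts:"):
            continue
        rest = line[len("hosts:"):]
        # Remove action items such as [NOTFOUND=return]; keep module names only.
        rest = re.sub(r"\[[^\]]*\]", " ", rest)
        return rest.split()
    return []


def synthesizers_before_dns(nsswitch_text: str) -> List[str]:
    """List modules that may answer with a synthetic name before dns is asked."""
    modules = hosts_modules(nsswitch_text)
    end = modules.index("dns") if "dns" in modules else len(modules)
    return [m for m in modules[:end] if m in SYNTHESIZING_MODULES]


SAMPLE = """\
# illustrative hosts line, not taken from a real system
hosts: files myhostname resolve [!UNAVAIL=return] dns
"""

print(synthesizers_before_dns(SAMPLE))  # ['myhostname', 'resolve']
```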

Given this problem, NM-1.40 significantly changed NetworkManager’s hostname assignment mechanism. The new implementation does not read /etc/hosts. Instead, it resolves an address via DNS in two stages. It first queries systemd-resolved with the NO_SYNTHESIZE flag, which disables synthesized results. If that query fails, it spawns a separate helper process that configures glibc to use only the dns NSS module and then performs the resolution. While the goal of this change was to modernize the resolution process and return correct results, it introduced an unexpected behavior change for systems that rely on /etc/hosts for hostname resolution.
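
The two-stage order can be sketched in Python as follows. This is a simplified model, not the actual C implementation: the two lookups are passed in as plain callables, and the stand-ins resolved_fails and dns_only_helper are hypothetical placeholders for the systemd-resolved query and the helper process.

```python
from typing import Callable, Optional

# A lookup takes an IP address string and returns a name, or None on failure.
Lookup = Callable[[str], Optional[str]]


def resolve_address(ip: str, ask_resolved: Lookup, run_helper: Lookup) -> Optional[str]:
    """Query systemd-resolved first; run the helper only if that query fails."""
    name = ask_resolved(ip)
    if name is not None:
        return name
    return run_helper(ip)


def resolved_fails(ip: str) -> Optional[str]:
    # Stand-in for a systemd-resolved query (synthesized results disabled) that fails.
    return None


def dns_only_helper(ip: str) -> Optional[str]:
    # Stand-in for the helper restricted to the dns module; no PTR record exists here.
    ptr_records: dict = {}
    return ptr_records.get(ip)


print(resolve_address("192.0.2.70", resolved_fails, dns_only_helper))  # None
```

With neither stage consulting /etc/hosts, an address that is only listed there yields no name in this model.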

The Problem

After upgrading from NM-1.18 to NM-1.40, a customer’s system hostname changed to localhost.localdomain. This issue emerged because the NIC was configured with a manual IP address, and the hostname was specified only in /etc/hosts. However, in version 1.40, NetworkManager did not check /etc/hosts to assign the hostname, unlike in NM-1.18, where this configuration worked seamlessly.

Reproducing the Issue

We managed to reproduce the issue with the following configuration:

NIC Configuration

nmcli con add type ethernet ifname ethX con-name static-ip ip4 192.0.2.70/24

/etc/hosts

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.0.2.70 myhostname.example.com

/etc/hostname

localhost.localdomain

/etc/NetworkManager/NetworkManager.conf

[main]
dns=none

/etc/resolv.conf

# Generated by NetworkManager

With this configuration, NM-1.18 set the hostname to myhostname.example.com, whereas version 1.40 retained it as localhost.localdomain.
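
To show what a lookup through the files module would find in this setup, here is a small Python sketch that searches hosts-file text for an address. It is a simplified stand-in for the real NSS files module: it compares addresses as plain strings and does no IPv6 normalization. The function name name_from_hosts is hypothetical.

```python
from typing import Optional

# Same content as the /etc/hosts file used in the reproduction.
HOSTS = """\
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.0.2.70 myhostname.example.com
"""


def name_from_hosts(hosts_text: str, ip: str) -> Optional[str]:
    """Return the first name listed for `ip` in hosts-file text, or None."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 2 and fields[0] == ip:
            return fields[1]  # first name on the line; later fields are aliases
    return None


print(name_from_hosts(HOSTS, "192.0.2.70"))  # myhostname.example.com
```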

Root Cause Analysis

The root cause of the discrepancy was traced back to differences in how hostname assignment is handled between NM-1.18 and NM-1.40. As mentioned before, in version 1.18, NetworkManager utilized g_resolver_lookup_by_address_async(), a GLib function that reads /etc/hosts to ensure hostname assignment for systems with static IP addresses. However, in NetworkManager 1.40, the helper was started only to resolve via the dns module, overlooking the results from /etc/hosts.

Solution

To restore the expected behavior, our team proposed and implemented the following changes:

  • Updating the helper: Previously, the helper resolved only via the dns module. Now, it resolves via both dns and files: if the query to systemd-resolved fails, the helper is also spawned to resolve using the files module. This way, a name that exists only in /etc/hosts is still found.
  • Enhancing consistency: We modified nm-daemon-helper to use both the dns and files NSS services, thereby ensuring that /etc/hosts is considered during hostname resolution.
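
The effect of the change can be sketched in Python by modeling the helper as a loop over a list of services. This is only an illustration under stated assumptions: the real helper is a separate C program, and the BACKENDS table and the helper_resolve function are hypothetical stand-ins for the dns and files NSS services.

```python
from typing import Callable, Dict, Optional, Sequence

Backend = Callable[[str], Optional[str]]


def helper_resolve(ip: str, services: Sequence[str], backends: Dict[str, Backend]) -> Optional[str]:
    """Ask each named service in order and return the first answer."""
    for service in services:
        name = backends[service](ip)
        if name is not None:
            return name
    return None


# Stand-ins: DNS has no PTR record for the address, /etc/hosts has an entry.
BACKENDS: Dict[str, Backend] = {
    "dns": lambda ip: None,
    "files": {"192.0.2.70": "myhostname.example.com"}.get,
}

print(helper_resolve("192.0.2.70", ["dns"], BACKENDS))           # None
print(helper_resolve("192.0.2.70", ["dns", "files"], BACKENDS))  # myhostname.example.com
```

The first call models the original 1.40 behavior, the second models the behavior after the fix.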

Detailed Implementation

The updated workflow is illustrated in the following flowchart:

NM hostname resolution workflow

The key steps include:

  • Static Hostname Check: If a static hostname (one set manually and permanently in /etc/hostname) is set, it is used directly.
  • Device List Evaluation: Build a sorted list of devices eligible for hostname evaluation.
  • Hostname Resolution via DHCP/DNS: Check if the hostname can be obtained from DHCP or DNS.
  • Fallback to /etc/hosts: If DHCP/DNS fails, attempt to resolve the hostname via /etc/hosts.
  • systemd-resolved Check: Use systemd-resolved with NO_SYNTHESIZE to avoid synthetic results.
  • Helper Invocation: If systemd-resolved is not available or fails, spawn a helper to check /etc/hosts.
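
The outer part of this workflow can be sketched in Python as below. It is a simplified model with several labeled assumptions: the Device fields and the priority sort key are hypothetical, the address lookup is passed in as a callable that stands for the systemd-resolved and helper stages, and a generic name such as localhost.localdomain is treated as "no static hostname", which is what the reproduction above suggests.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Assumption: generic placeholder names do not count as a static hostname.
GENERIC_NAMES = {"localhost", "localhost.localdomain"}


@dataclass
class Device:
    name: str
    priority: int                        # hypothetical sort key; lower is tried first
    address: Optional[str] = None
    dhcp_hostname: Optional[str] = None


def pick_hostname(
    static_hostname: Optional[str],
    devices: Iterable[Device],
    lookup_address: Callable[[str], Optional[str]],
    fallback: str = "localhost.localdomain",
) -> str:
    """Pick a hostname: static first, then per device DHCP, then address lookup."""
    if static_hostname and static_hostname not in GENERIC_NAMES:
        return static_hostname
    for dev in sorted(devices, key=lambda d: d.priority):
        if dev.dhcp_hostname:
            return dev.dhcp_hostname
        if dev.address:
            name = lookup_address(dev.address)
            if name:
                return name
    return fallback


devices = [Device("eth1", 20, address="192.0.2.70"), Device("eth0", 10)]
hosts_entries = {"192.0.2.70": "myhostname.example.com"}
print(pick_hostname("localhost.localdomain", devices, hosts_entries.get))
```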

The changes that make hostname resolution also consult /etc/hosts can be found in this merge request:

Conclusion

This issue highlighted the need for thorough testing and flexibility in handling hostname assignments across different system configurations. By incorporating feedback from users and conducting a detailed analysis, we were able to enhance NetworkManager’s hostname resolution process in NM-1.40 and later releases, ensuring it meets the high standards expected by our users.
