File paths garbled when a Qt process has no locale environment variables set

2018-04-17 23:04 programming

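The problem

A few days ago I helped a colleague debug a problem. The symptoms were as follows:

The front-end UI process has a menu item which, when activated, calls a back-end dbus service written in Qt. The menu passes the absolute path of the currently selected file to that dbus service. When the path contained Chinese characters, the service no longer recognized it. Adding some print statements showed that QFileInfo::exists(filepath); returned false.

Printing the filepath variable showed the correct value.

A check with stat(filepath.toStdString().c_str(), &statbuf); then showed that the file did exist.

That was strange.

However, if the service was started by hand from a terminal beforehand, calls from the front end worked fine and the path caused no problems.

After comparing the two ways of starting the service over and over, I guessed that something went wrong when the string was converted between encodings, which made me think of environment variables.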

Printing the LANG and LANGUAGE environment variables showed that both were blank; neither had been set.
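A check like this can be sketched in plain C++ without Qt. The helper name describeEnv is made up for this illustration; it only wraps the standard getenv() call and distinguishes "unset" from "set but empty", which look the same in a casual printout.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: reports whether an environment variable is unset,
// empty, or set to a value.
std::string describeEnv(const char* name)
{
    const char* value = std::getenv(name);
    if (value == nullptr)
        return std::string(name) + " is unset";
    if (*value == '\0')
        return std::string(name) + " is empty";
    return std::string(name) + "=" + value;
}
```

Logging describeEnv("LANG"), describeEnv("LANGUAGE") and describeEnv("LC_ALL") at startup makes the difference between a terminal launch and a dbus launch visible right away.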

At the top of the program's main() function, set the two variables by hand first:

qputenv("LANG", "en_US.UTF8");
qputenv("LANGUAGE", "en_US");

Running it again, everything worked.

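Tracing the root cause

Now let's go back over the whole problem. First, when the dbus manager execs a new dbus service process, it presumably clears most of the environment variables. Let's locate this in the dbus code.

First get the source of the dbus project. dpkg -l | grep dbus shows: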

ii  at-spi2-core                                 2.28.0-1                                amd64        Assistive Technology Service Provider Interface (dbus core)
ii  dbus                                         1.12.6-2                                amd64        simple interprocess messaging system (daemon and utilities)
ii  dbus-user-session                            1.12.6-2                                amd64        simple interprocess messaging system (systemd --user integration)
ii  dbus-x11                                     1.12.6-2                                amd64        simple interprocess messaging system (X11 deps)
ii  libdbus-1-3:amd64                            1.12.6-2                                amd64        simple interprocess messaging system (library)
ii  libdbus-glib-1-2:amd64                       0.110-2                                 amd64        deprecated library for D-Bus IPC
ii  libdbusmenu-glib4:amd64                      16.04.1+17.04.20170109.1-5              amd64        library for passing menus over DBus
ii  libdbusmenu-gtk4:amd64                       16.04.1+17.04.20170109.1-5              amd64        library for passing menus over DBus - GTK-2+ version
ii  libdleyna-connector-dbus-1.0-1:amd64         0.2.0-1                                 amd64        DBus connector module for the dLeyna services
ii  libnet-dbus-perl                             1.1.0-4+b3                              amd64        Perl extension for the DBus bindings
ii  libqt4-dbus:amd64                            4:4.8.7+dfsg-11                         amd64        Qt 4 D-Bus module
ii  libqt5dbus5:amd64                            5.10.1+dfsg-5                           amd64        Qt 5 D-Bus module
ii  libqtdbus4:amd64                             4:4.8.7+dfsg-11                         amd64        Qt 4 D-Bus module library
ii  python-dbus                                  1.2.6-1                                 amd64        simple interprocess messaging system (Python interface)
ii  python3-dbus                                 1.2.6-1                                 amd64        simple interprocess messaging system (Python 3 interface)
ii  qdbus                                        4:4.8.7+dfsg-11                         amd64        Qt 4 D-Bus tool

The dbus package we want is libdbus-1-3, so download its source:

$ apt-get source libdbus-1-3

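Now, inside the dbus source directory, look for execve(), or one of the variants of this system call that it might use.

$ grep -nHR execve
dbus/dbus-spawn.c:1072:  execve (argv[0], argv, envp);

Using this function as the entry point and working back through its callers step by step, I eventually found: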

static dbus_bool_t
add_bus_environment (BusActivation *activation,
                     DBusError     *error)
{
  const char *type;

  if (!bus_activation_set_environment_variable (activation,
                                                "DBUS_STARTER_ADDRESS",
                                                activation->server_address,
                                                error))
    return FALSE;

  type = bus_context_get_type (activation->context);
  if (type != NULL)
    {
      if (!bus_activation_set_environment_variable (activation,
                                                    "DBUS_STARTER_BUS_TYPE", type,
                                                    error))
        return FALSE;

      if (strcmp (type, "session") == 0)
        {
          if (!bus_activation_set_environment_variable (activation,
                                                        "DBUS_SESSION_BUS_ADDRESS",
                                                        activation->server_address,
                                                        error))
            return FALSE;
        }
      else if (strcmp (type, "system") == 0)
        {
          if (!bus_activation_set_environment_variable (activation,
                                                        "DBUS_SYSTEM_BUS_ADDRESS",
                                                        activation->server_address,
                                                        error))
            return FALSE;
        }
    }

  return TRUE;
}

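As you can see, the only variables this function adds to the environment of a child process created by dbus are a few D-Bus specific ones (DBUS_STARTER_ADDRESS, DBUS_STARTER_BUS_TYPE and the bus address variable). Nothing here sets LANG or any other locale variable.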
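The effect of execve() with an explicit envp can be reproduced in a few lines. The sketch below is not taken from dbus. The function name runWithEnv and the use of /bin/sh are assumptions made for the demonstration. It starts a shell with exactly the environment it is given and returns what the shell printed.

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `/bin/sh -c script` with exactly the environment
// in `env` (entries of the form "NAME=value") and returns the child's stdout.
std::string runWithEnv(const std::vector<std::string>& env, const std::string& script)
{
    int fds[2];
    if (pipe(fds) == -1)
        return {};

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return {};
    }

    if (pid == 0) {
        // Child: send stdout into the pipe, then replace the process image.
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);

        std::vector<char*> envp;
        for (const std::string& e : env)
            envp.push_back(const_cast<char*>(e.c_str()));
        envp.push_back(nullptr);

        char* const argv[] = {
            const_cast<char*>("/bin/sh"), const_cast<char*>("-c"),
            const_cast<char*>(script.c_str()), nullptr
        };
        execve(argv[0], argv, envp.data());
        _exit(127);  // only reached if execve() failed
    }

    // Parent: read until the child closes its end of the pipe.
    close(fds[1]);
    std::string out;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        out.append(buf, static_cast<size_t>(n));
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return out;
}
```

Even if the parent has LANG set, a child started with an envp that contains only DBUS_STARTER_BUS_TYPE=session sees no LANG at all, because execve() passes on nothing except what is in envp.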

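But that raises another question: why do programs such as gtk ones not have this problem, while Qt programs do? Let's find QFileInfo::exists(const QString& filepath); in Qt and see what it calls internally.

First download the Qt5 source code: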

$ dpkg -l | grep libqt5

ii  libqt5concurrent5:amd64                      5.10.1+dfsg-5                           amd64        Qt 5 concurrent module
ii  libqt5core5a:amd64                           5.10.1+dfsg-5                           amd64        Qt 5 core module
ii  libqt5dbus5:amd64                            5.10.1+dfsg-5                           amd64        Qt 5 D-Bus module
ii  libqt5designer5:amd64                        5.10.1-2                                amd64        Qt 5 designer module
ii  libqt5designercomponents5:amd64              5.10.1-2                                amd64        Qt 5 Designer components module
ii  libqt5gui5:amd64                             5.10.1+dfsg-5                           amd64        Qt 5 GUI module
ii  libqt5help5:amd64                            5.10.1-2                                amd64        Qt 5 help module
ii  libqt5multimedia5:amd64                      5.10.1-2                                amd64        Qt 5 Multimedia module
ii  libqt5multimedia5-plugins:amd64              5.10.1-2                                amd64        Qt 5 Multimedia module plugins
ii  libqt5multimediagsttools5:amd64              5.10.1-2                                amd64        GStreamer tools for  Qt 5 Multimedia module
ii  libqt5multimediawidgets5:amd64               5.10.1-2                                amd64        Qt 5 Multimedia Widgets module
ii  libqt5network5:amd64                         5.10.1+dfsg-5                           amd64        Qt 5 network module
ii  libqt5opengl5:amd64                          5.10.1+dfsg-5                           amd64        Qt 5 OpenGL module
ii  libqt5opengl5-dev:amd64                      5.10.1+dfsg-5                           amd64        Qt 5 OpenGL library development files
ii  libqt5positioning5:amd64                     5.10.1+dfsg-3                           amd64        Qt Positioning module
ii  libqt5printsupport5:amd64                    5.10.1+dfsg-5                           amd64        Qt 5 print support module
ii  libqt5qml5:amd64                             5.10.1-4                                amd64        Qt 5 QML module
ii  libqt5quick5:amd64                           5.10.1-4                                amd64        Qt 5 Quick library
ii  libqt5quickparticles5:amd64                  5.10.1-4                                amd64        Qt 5 Quick particles module
ii  libqt5quicktest5:amd64                       5.10.1-4                                amd64        Qt 5 Quick Test library
ii  libqt5quickwidgets5:amd64                    5.10.1-4                                amd64        Qt 5 Quick Widgets library
ii  libqt5sensors5:amd64                         5.10.1-3                                amd64        Qt Sensors module
ii  libqt5sql5:amd64                             5.10.1+dfsg-5                           amd64        Qt 5 SQL module
ii  libqt5sql5-sqlite:amd64                      5.10.1+dfsg-5                           amd64        Qt 5 SQLite 3 database driver
ii  libqt5svg5:amd64                             5.10.1-2                                amd64        Qt 5 SVG module
ii  libqt5svg5-dev:amd64                         5.10.1-2                                amd64        Qt 5 SVG module development files
ii  libqt5test5:amd64                            5.10.1+dfsg-5                           amd64        Qt 5 test module
ii  libqt5waylandclient5:amd64                   5.10.1-3                                amd64        QtWayland client library
ii  libqt5waylandcompositor5:amd64               5.10.1-3                                amd64        QtWayland compositor library
ii  libqt5webchannel5:amd64                      5.10.1-2                                amd64        Web communication library for Qt
ii  libqt5webchannel5-dev:amd64                  5.10.1-2                                amd64        Web communication library for Qt - development files
ii  libqt5webkit5:amd64                          5.212.0~alpha2-9                        amd64        Web content engine library for Qt
ii  libqt5widgets5:amd64                         5.10.1+dfsg-5                           amd64        Qt 5 widgets module
ii  libqt5x11extras5:amd64                       5.10.1-2                                amd64        Qt 5 X11 extras
ii  libqt5x11extras5-dev:amd64                   5.10.1-2                                amd64        Qt 5 X11 extras development files
ii  libqt5xml5:amd64                             5.10.1+dfsg-5                           amd64        Qt 5 XML module

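libqt5core5a looks like the right package. I have to complain that Debian splits upstream packages far too finely!

Download the source code:

$ apt-get source libqt5core5a

First look for the QFileInfo class in the Qt source tree, since Qt names its classes and files quite consistently: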

$ find -iname 'qfileinfo*'
./include/QtWidgets/5.10.1/QtWidgets/private/qfileinfogatherer_p.h
./include/QtCore/5.10.1/QtCore/private/qfileinfo_p.h
./include/QtCore/qfileinfo.h
./include/QtCore/QFileInfoList
./include/QtCore/QFileInfo
./tests/benchmarks/corelib/io/qfileinfo
./tests/benchmarks/corelib/io/qfileinfo/qfileinfo.pro
./tests/auto/corelib/io/qfileinfo
./tests/auto/corelib/io/qfileinfo/qfileinfo.pro
./tests/auto/corelib/io/qfileinfo/qfileinfo.qrc
./src/widgets/dialogs/qfileinfogatherer_p.h
./src/widgets/dialogs/qfileinfogatherer.cpp
./src/corelib/io/qfileinfo_p.h
./src/corelib/io/qfileinfo.h
./src/corelib/io/qfileinfo.cpp

src/corelib/io/qfileinfo.cpp is the file we are looking for.

In qfileinfo.cpp, find the exists(const QString& filepath); method and follow the call chain step by step. You eventually arrive at:

//static
bool QFileSystemEngine::fillMetaData(const QFileSystemEntry &entry, QFileSystemMetaData &data,
        QFileSystemMetaData::MetaDataFlags what)
{
    if (Q_UNLIKELY(entry.isEmpty()))
        return emptyFileEntryWarning(), false;

#if defined(Q_OS_DARWIN)
    if (what & QFileSystemMetaData::BundleType) {
        if (!data.hasFlags(QFileSystemMetaData::DirectoryType))
            what |= QFileSystemMetaData::DirectoryType;
    }
#endif
#ifdef UF_HIDDEN
    if (what & QFileSystemMetaData::HiddenAttribute) {
        // OS X >= 10.5: st_flags & UF_HIDDEN
        what |= QFileSystemMetaData::PosixStatFlags;
    }
#endif // defined(Q_OS_DARWIN)

    // if we're asking for any of the stat(2) flags, then we're getting them all
    if (what & QFileSystemMetaData::PosixStatFlags)
        what |= QFileSystemMetaData::PosixStatFlags;

    data.entryFlags &= ~what;

    const QByteArray nativeFilePath = entry.nativeFilePath();
    int entryErrno = 0; // innocent until proven otherwise

    // first, we may try lstat(2). Possible outcomes:
    //  - success and is a symlink: filesystem entry exists, but we need stat(2)
    //    -> statResult = -1;
    //  - success and is not a symlink: filesystem entry exists and we're done
    //    -> statResult = 0
    //  - failure: really non-existent filesystem entry
    //    -> entryExists = false; statResult = 0;
    //    both stat(2) and lstat(2) may generate a number of different errno
    //    conditions, but of those, the only ones that could happen and the
    //    entry still exist are EACCES, EFAULT, ENOMEM and EOVERFLOW. If we get
    //    EACCES or ENOMEM, then we have no choice on how to proceed, so we may
    //    as well conclude it doesn't exist; EFAULT can't happen and EOVERFLOW
    //    shouldn't happen because we build in _LARGEFIE64.
    union {
        QT_STATBUF statBuffer;
        struct statx statxBuffer;
    };
    int statResult = -1;
    if (what & QFileSystemMetaData::LinkType) {
        mode_t mode = 0;
        statResult = qt_lstatx(nativeFilePath, &statxBuffer);
        if (statResult == -ENOSYS) {
            // use lstst(2)
            statResult = QT_LSTAT(nativeFilePath, &statBuffer);
            if (statResult == 0)
                mode = statBuffer.st_mode;
        } else if (statResult == 0) {
            statResult = 1; // record it was statx(2) that succeeded
            mode = statxBuffer.stx_mode;
        }

        if (statResult >= 0) {
            if (S_ISLNK(mode)) {
               // it's a symlink, we don't know if the file "exists"
                data.entryFlags |= QFileSystemMetaData::LinkType;
                statResult = -1;    // force stat(2) below
            } else {
                // it's a reagular file and it exists
                if (statResult)
                    data.fillFromStatxBuf(statxBuffer);
                else
                    data.fillFromStatBuf(statBuffer);
                data.knownFlagsMask |= QFileSystemMetaData::PosixStatFlags
                        | QFileSystemMetaData::ExistsAttribute;
                data.entryFlags |= QFileSystemMetaData::ExistsAttribute;
            }
        } else {
            // it doesn't exist
            entryErrno = errno;
            data.knownFlagsMask |= QFileSystemMetaData::ExistsAttribute;
        }

        data.knownFlagsMask |= QFileSystemMetaData::LinkType;
    }

    // second, we try a regular stat(2)
    if (statResult == -1 && (what & QFileSystemMetaData::PosixStatFlags)) {
        if (entryErrno == 0 && statResult == -1) {
            data.entryFlags &= ~QFileSystemMetaData::PosixStatFlags;
            statResult = qt_statx(nativeFilePath, &statxBuffer);
            if (statResult == -ENOSYS) {
                // use stat(2)
                statResult = QT_STAT(nativeFilePath, &statBuffer);
                if (statResult == 0)
                    data.fillFromStatBuf(statBuffer);
            } else if (statResult == 0) {
                data.fillFromStatxBuf(statxBuffer);
            }
        }

        if (statResult != 0) {
            entryErrno = errno;
            data.birthTime_ = 0;
            data.metadataChangeTime_ = 0;
            data.modificationTime_ = 0;
            data.accessTime_ = 0;
            data.size_ = 0;
            data.userId_ = (uint) -2;
            data.groupId_ = (uint) -2;
        }

        // reset the mask
        data.knownFlagsMask |= QFileSystemMetaData::PosixStatFlags
            | QFileSystemMetaData::ExistsAttribute;
    }

    // third, we try access(2)
    if (what & (QFileSystemMetaData::UserPermissions | QFileSystemMetaData::ExistsAttribute)) {
        // calculate user permissions
        auto checkAccess = [&](QFileSystemMetaData::MetaDataFlag flag, int mode) {
            if (entryErrno != 0 || (what & flag) == 0)
                return;
            if (QT_ACCESS(nativeFilePath, mode) == 0) {
                // access ok (and file exists)
                data.entryFlags |= flag | QFileSystemMetaData::ExistsAttribute;
            } else if (errno != EACCES && errno != EROFS) {
                entryErrno = errno;
            }
        };

        checkAccess(QFileSystemMetaData::UserReadPermission, R_OK);
        checkAccess(QFileSystemMetaData::UserWritePermission, W_OK);
        checkAccess(QFileSystemMetaData::UserExecutePermission, X_OK);

        // if we still haven't found out if the file exists, try F_OK
        if (entryErrno == 0 && (data.entryFlags & QFileSystemMetaData::ExistsAttribute) == 0) {
            if (QT_ACCESS(nativeFilePath, F_OK) == -1)
                entryErrno = errno;
            else
                data.entryFlags |= QFileSystemMetaData::ExistsAttribute;
        }

        data.knownFlagsMask |= (what & QFileSystemMetaData::UserPermissions) |
                QFileSystemMetaData::ExistsAttribute;
    }

#if defined(Q_OS_DARWIN)
    if (what & QFileSystemMetaData::AliasType) {
        if (entryErrno == 0 && hasResourcePropertyFlag(data, entry, kCFURLIsAliasFileKey))
            data.entryFlags |= QFileSystemMetaData::AliasType;
        data.knownFlagsMask |= QFileSystemMetaData::AliasType;
    }

    if (what & QFileSystemMetaData::BundleType) {
        if (entryErrno == 0 && isPackage(data, entry))
            data.entryFlags |= QFileSystemMetaData::BundleType;

        data.knownFlagsMask |= QFileSystemMetaData::BundleType;
    }
#endif

    if (what & QFileSystemMetaData::HiddenAttribute
            && !data.isHidden()) {
        QString fileName = entry.fileName();
        if ((fileName.size() > 0 && fileName.at(0) == QLatin1Char('.'))
#if defined(Q_OS_DARWIN)
                || (entryErrno == 0 && hasResourcePropertyFlag(data, entry, kCFURLIsHiddenKey))
#endif
                )
            data.entryFlags |= QFileSystemMetaData::HiddenAttribute;
        data.knownFlagsMask |= QFileSystemMetaData::HiddenAttribute;
    }

    if (entryErrno != 0) {
        what &= ~QFileSystemMetaData::LinkType; // don't clear link: could be broken symlink
        data.clearFlags(what);
        return false;
    }
    return true;
}

This line is the key:

const QByteArray nativeFilePath = entry.nativeFilePath();

Tracing further, I found:

void QFileSystemEntry::resolveNativeFilePath() const
{
    if (!m_filePath.isEmpty() && m_nativeFilePath.isEmpty()) {
#ifdef Q_OS_WIN
        QString filePath = m_filePath;
        if (isRelative())
            filePath = fixIfRelativeUncPath(m_filePath);
        m_nativeFilePath = QFSFileEnginePrivate::longFileName(QDir::toNativeSeparators(filePath));
#elif defined(QFILESYSTEMENTRY_NATIVE_PATH_IS_UTF16)
        m_nativeFilePath = QDir::toNativeSeparators(m_filePath);
#else
        m_nativeFilePath = QFile::encodeName(QDir::toNativeSeparators(m_filePath));
#endif
#ifdef Q_OS_WINRT
        while (m_nativeFilePath.startsWith(QLatin1Char('\\')))
            m_nativeFilePath.remove(0,1);
        if (m_nativeFilePath.isEmpty())
            m_nativeFilePath.append(QLatin1Char('.'));
        // WinRT/MSVC2015 allows a maximum of 256 characters for a filepath
        // unless //?/ is prepended which extends the rule to have a maximum
        // of 256 characters in the filename plus the preprending path
        m_nativeFilePath.prepend("\\\\?\\");
#endif
    }
}

The key line is:

m_nativeFilePath = QFile::encodeName(QDir::toNativeSeparators(m_filePath));

This line calls the following code in qfile.h:

#if defined(Q_OS_DARWIN)
    // Mac always expects filenames in UTF-8... and decomposed...
    static inline QByteArray encodeName(const QString &fileName)
    {
        return fileName.normalized(QString::NormalizationForm_D).toUtf8();
    }
    static QString decodeName(const QByteArray &localFileName)
    {
        // note: duplicated in qglobal.cpp (qEnvironmentVariable)
        return QString::fromUtf8(localFileName).normalized(QString::NormalizationForm_C);
    }
#else
    static inline QByteArray encodeName(const QString &fileName)
    {
        return fileName.toLocal8Bit();
    }
    static QString decodeName(const QByteArray &localFileName)
    {
        return QString::fromLocal8Bit(localFileName);
    }
#endif

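On Darwin (Mac OS X), toUtf8() is used to convert the QString to a QByteArray, while on other platforms QString::toLocal8Bit() is used directly.

This is where the trap lies.

When a Qt process runs under a different locale encoding, toLocal8Bit() returns a different value!

The comment in qstring.cpp says: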

toLocal8Bit() returns an 8-bit string using the system's local encoding.
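What "the system's local encoding" means at the C library level can be checked without Qt. The sketch below asks the C library which character set the environment selects, using the standard setlocale() and the POSIX nl_langinfo() calls. The function name localeCodeset is invented for this example, and how Qt itself derives its locale codec may differ in detail.

```cpp
#include <clocale>
#include <string>
#include <langinfo.h>

// Hypothetical helper: returns the character set of the locale selected by
// the environment. Note that it changes the process-wide LC_CTYPE locale.
std::string localeCodeset()
{
    // "" makes the C library consult LC_ALL, LC_CTYPE and LANG, in that order.
    // If the call fails, the locale stays unchanged (initially the "C" locale).
    std::setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}
```

With LANG=en_US.UTF-8 in the environment and that locale installed, the result is a UTF-8 codeset. With no locale variables at all, the process stays in the "C" locale and the reported codeset is an ASCII-style name that depends on the C library.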

Let's trace what toLocal8Bit() actually does:

QByteArray QString::toLocal8Bit_helper(const QChar *data, int size)
{
    return qt_convert_to_local_8bit(QStringView(data, size));
}

static QByteArray qt_convert_to_local_8bit(QStringView string)
{
    if (string.isNull())
        return QByteArray();
#ifndef QT_NO_TEXTCODEC
    QTextCodec *localeCodec = QTextCodec::codecForLocale();
    if (localeCodec)
        return localeCodec->fromUnicode(string);
#endif // QT_NO_TEXTCODEC
    return qt_convert_to_latin1(string);
}

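Here, when the LANG variable is empty, QTextCodec::codecForLocale() returns a null pointer in the Qt process, so the code above calls qt_convert_to_latin1(string);.

It simply returns a latin1-encoded QByteArray!

But we expected the path to be encoded as utf8! This also explains why the earlier stat() check succeeded: toStdString() always encodes as utf8, regardless of the locale.

And with that, the mystery is solved.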
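To see why a latin1-encoded path cannot match a file whose name is stored as utf8, compare the bytes the two encodings produce. The two functions below are a standalone illustration, not Qt code, and their names are invented. The latin1 version assumes that unrepresentable characters are replaced with '?', which is one of the behaviours the Qt documentation allows for toLatin1().

```cpp
#include <string>

// Illustration only: encodes Unicode code points (up to U+10FFFF) as UTF-8.
std::string encodeUtf8(const std::u32string& text)
{
    std::string out;
    for (char32_t c : text) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else if (c < 0x800) {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (c >> 18));
            out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// Illustration only: encodes as Latin-1. Anything above U+00FF has no
// Latin-1 representation and is replaced with '?' (an assumption, see above).
std::string encodeLatin1(const std::u32string& text)
{
    std::string out;
    for (char32_t c : text)
        out += (c <= 0xFF) ? static_cast<char>(c) : '?';
    return out;
}
```

For a path such as /tmp/中, encodeUtf8() yields the three bytes E4 B8 AD for the Chinese character, while encodeLatin1() yields a single '?'. The kernel compares file names byte by byte, so the second form names a different file, one that does not exist.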

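Appendix: redirecting all logs to a file

Their project mixes ostream and QDebug, which is a bad habit. To make debugging easier at the time, I simply redirected everything that would have gone to the terminal into a log file. It was done roughly like this, at the top of main():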

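// Needs <fcntl.h>, <unistd.h>, <cstdio> and <cstdlib>.
int fd = open("/tmp/file-manager-daemon.log",
              O_CREAT | O_WRONLY | O_APPEND | O_SYNC,  // append, so a restart does not overwrite old output
              S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
if (fd == -1) {
  perror("open() failed to open log file!");
  exit(1);
}

// Point stdout and stderr at the log file.
dup2(fd, STDOUT_FILENO);
dup2(fd, STDERR_FILENO);
// The original descriptor is no longer needed once it has been duplicated.
if (fd > STDERR_FILENO)
  close(fd);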