File paths get mangled when a Qt process has no locale environment variables set
2018-04-17 23:04 programming
Problem
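A few days ago I helped a colleague debug a problem that looked like this:
The front-end UI process has a menu item which, when activated, calls a back-end dbus service written in Qt. The menu passes the absolute path of the currently selected file to that service. Whenever the path contained Chinese characters, the service no longer recognized it. Logging showed that QFileInfo::exists(filepath); returned false.
Printing the filepath variable showed the correct value.
A check with stat(filepath.toStdString().c_str(), &statbuf); then showed that the file did exist.
That was odd.
However, if the service was started by hand from a terminal beforehand, calls from the front end worked fine and the path caused no trouble.
I compared the two launch modes over and over, guessed that something was going wrong when the string was converted between encodings, and that made me think of the environment variables.
Printing the LANG and LANGUAGE environment variables showed that both were empty; neither had been set.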
At the very top of the program's main() function, I first initialized the two variables by hand:
qputenv("LANG", "en_US.UTF8");
qputenv("LANGUAGE", "en_US");
I ran it again, and this time it worked.
Tracking down the root cause
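Now let's go back over the whole problem. First, when the dbus manager execs a new dbus service process, it presumably clears most of the environment variables. Let's locate that in the dbus code.
First get the dbus source code. dpkg -l | grep dbus shows: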
ii at-spi2-core 2.28.0-1 amd64 Assistive Technology Service Provider Interface (dbus core)
ii dbus 1.12.6-2 amd64 simple interprocess messaging system (daemon and utilities)
ii dbus-user-session 1.12.6-2 amd64 simple interprocess messaging system (systemd --user integration)
ii dbus-x11 1.12.6-2 amd64 simple interprocess messaging system (X11 deps)
ii libdbus-1-3:amd64 1.12.6-2 amd64 simple interprocess messaging system (library)
ii libdbus-glib-1-2:amd64 0.110-2 amd64 deprecated library for D-Bus IPC
ii libdbusmenu-glib4:amd64 16.04.1+17.04.20170109.1-5 amd64 library for passing menus over DBus
ii libdbusmenu-gtk4:amd64 16.04.1+17.04.20170109.1-5 amd64 library for passing menus over DBus - GTK-2+ version
ii libdleyna-connector-dbus-1.0-1:amd64 0.2.0-1 amd64 DBus connector module for the dLeyna services
ii libnet-dbus-perl 1.1.0-4+b3 amd64 Perl extension for the DBus bindings
ii libqt4-dbus:amd64 4:4.8.7+dfsg-11 amd64 Qt 4 D-Bus module
ii libqt5dbus5:amd64 5.10.1+dfsg-5 amd64 Qt 5 D-Bus module
ii libqtdbus4:amd64 4:4.8.7+dfsg-11 amd64 Qt 4 D-Bus module library
ii python-dbus 1.2.6-1 amd64 simple interprocess messaging system (Python interface)
ii python3-dbus 1.2.6-1 amd64 simple interprocess messaging system (Python 3 interface)
ii qdbus 4:4.8.7+dfsg-11 amd64 Qt 4 D-Bus tool
The package for dbus itself is libdbus-1-3, so download the source of that package:
$ apt-get source libdbus-1-3
Now, inside the dbus source directory, look for execve(), or for one of the variants of that system call it might be using.
$ grep -nHR execve
dbus/dbus-spawn.c:1072: execve (argv[0], argv, envp);
Using this call as the entry point and following its callers step by step, I eventually arrived at:
static dbus_bool_t
add_bus_environment (BusActivation *activation,
                     DBusError     *error)
{
  const char *type;

  if (!bus_activation_set_environment_variable (activation,
                                                "DBUS_STARTER_ADDRESS",
                                                activation->server_address,
                                                error))
    return FALSE;

  type = bus_context_get_type (activation->context);
  if (type != NULL)
    {
      if (!bus_activation_set_environment_variable (activation,
                                                    "DBUS_STARTER_BUS_TYPE", type,
                                                    error))
        return FALSE;

      if (strcmp (type, "session") == 0)
        {
          if (!bus_activation_set_environment_variable (activation,
                                                        "DBUS_SESSION_BUS_ADDRESS",
                                                        activation->server_address,
                                                        error))
            return FALSE;
        }
      else if (strcmp (type, "system") == 0)
        {
          if (!bus_activation_set_environment_variable (activation,
                                                        "DBUS_SYSTEM_BUS_ADDRESS",
                                                        activation->server_address,
                                                        error))
            return FALSE;
        }
    }

  return TRUE;
}
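As you can see, the only variables dbus explicitly adds for the child process are its own: DBUS_STARTER_ADDRESS, DBUS_STARTER_BUS_TYPE, and the matching bus address variable. LANG and LANGUAGE are not among them, so the service only gets a locale if the environment dbus hands over already contains one.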
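But that raises another question: why do programs built on gtk not have this problem, while a Qt program does?
Let's find QFileInfo::exists(const QString& filepath); in Qt and see what it calls internally.
First, download the Qt5 source code: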
$ dpkg -l | grep libqt5
ii libqt5concurrent5:amd64 5.10.1+dfsg-5 amd64 Qt 5 concurrent module
ii libqt5core5a:amd64 5.10.1+dfsg-5 amd64 Qt 5 core module
ii libqt5dbus5:amd64 5.10.1+dfsg-5 amd64 Qt 5 D-Bus module
ii libqt5designer5:amd64 5.10.1-2 amd64 Qt 5 designer module
ii libqt5designercomponents5:amd64 5.10.1-2 amd64 Qt 5 Designer components module
ii libqt5gui5:amd64 5.10.1+dfsg-5 amd64 Qt 5 GUI module
ii libqt5help5:amd64 5.10.1-2 amd64 Qt 5 help module
ii libqt5multimedia5:amd64 5.10.1-2 amd64 Qt 5 Multimedia module
ii libqt5multimedia5-plugins:amd64 5.10.1-2 amd64 Qt 5 Multimedia module plugins
ii libqt5multimediagsttools5:amd64 5.10.1-2 amd64 GStreamer tools for Qt 5 Multimedia module
ii libqt5multimediawidgets5:amd64 5.10.1-2 amd64 Qt 5 Multimedia Widgets module
ii libqt5network5:amd64 5.10.1+dfsg-5 amd64 Qt 5 network module
ii libqt5opengl5:amd64 5.10.1+dfsg-5 amd64 Qt 5 OpenGL module
ii libqt5opengl5-dev:amd64 5.10.1+dfsg-5 amd64 Qt 5 OpenGL library development files
ii libqt5positioning5:amd64 5.10.1+dfsg-3 amd64 Qt Positioning module
ii libqt5printsupport5:amd64 5.10.1+dfsg-5 amd64 Qt 5 print support module
ii libqt5qml5:amd64 5.10.1-4 amd64 Qt 5 QML module
ii libqt5quick5:amd64 5.10.1-4 amd64 Qt 5 Quick library
ii libqt5quickparticles5:amd64 5.10.1-4 amd64 Qt 5 Quick particles module
ii libqt5quicktest5:amd64 5.10.1-4 amd64 Qt 5 Quick Test library
ii libqt5quickwidgets5:amd64 5.10.1-4 amd64 Qt 5 Quick Widgets library
ii libqt5sensors5:amd64 5.10.1-3 amd64 Qt Sensors module
ii libqt5sql5:amd64 5.10.1+dfsg-5 amd64 Qt 5 SQL module
ii libqt5sql5-sqlite:amd64 5.10.1+dfsg-5 amd64 Qt 5 SQLite 3 database driver
ii libqt5svg5:amd64 5.10.1-2 amd64 Qt 5 SVG module
ii libqt5svg5-dev:amd64 5.10.1-2 amd64 Qt 5 SVG module development files
ii libqt5test5:amd64 5.10.1+dfsg-5 amd64 Qt 5 test module
ii libqt5waylandclient5:amd64 5.10.1-3 amd64 QtWayland client library
ii libqt5waylandcompositor5:amd64 5.10.1-3 amd64 QtWayland compositor library
ii libqt5webchannel5:amd64 5.10.1-2 amd64 Web communication library for Qt
ii libqt5webchannel5-dev:amd64 5.10.1-2 amd64 Web communication library for Qt - development files
ii libqt5webkit5:amd64 5.212.0~alpha2-9 amd64 Web content engine library for Qt
ii libqt5widgets5:amd64 5.10.1+dfsg-5 amd64 Qt 5 widgets module
ii libqt5x11extras5:amd64 5.10.1-2 amd64 Qt 5 X11 extras
ii libqt5x11extras5-dev:amd64 5.10.1-2 amd64 Qt 5 X11 extras development files
ii libqt5xml5:amd64 5.10.1+dfsg-5 amd64 Qt 5 XML module
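As you can see, libqt5core5a should be the package we want. I can't help complaining that Debian splits upstream packages into far too many pieces!
Download the source: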
$ apt-get source libqt5core5a
First look for the QFileInfo class in the Qt source tree; Qt names its classes consistently, so the file name is easy to guess:
$ find -iname 'qfileinfo*'
./include/QtWidgets/5.10.1/QtWidgets/private/qfileinfogatherer_p.h
./include/QtCore/5.10.1/QtCore/private/qfileinfo_p.h
./include/QtCore/qfileinfo.h
./include/QtCore/QFileInfoList
./include/QtCore/QFileInfo
./tests/benchmarks/corelib/io/qfileinfo
./tests/benchmarks/corelib/io/qfileinfo/qfileinfo.pro
./tests/auto/corelib/io/qfileinfo
./tests/auto/corelib/io/qfileinfo/qfileinfo.pro
./tests/auto/corelib/io/qfileinfo/qfileinfo.qrc
./src/widgets/dialogs/qfileinfogatherer_p.h
./src/widgets/dialogs/qfileinfogatherer.cpp
./src/corelib/io/qfileinfo_p.h
./src/corelib/io/qfileinfo.h
./src/corelib/io/qfileinfo.cpp
src/corelib/io/qfileinfo.cpp is the file we are looking for.
Find the exists(const QString& filepath); method in qfileinfo.cpp, follow the call chain step by step, and you eventually reach:
//static
bool QFileSystemEngine::fillMetaData(const QFileSystemEntry &entry, QFileSystemMetaData &data,
QFileSystemMetaData::MetaDataFlags what)
{
if (Q_UNLIKELY(entry.isEmpty()))
return emptyFileEntryWarning(), false;
#if defined(Q_OS_DARWIN)
if (what & QFileSystemMetaData::BundleType) {
if (!data.hasFlags(QFileSystemMetaData::DirectoryType))
what |= QFileSystemMetaData::DirectoryType;
}
#endif
#ifdef UF_HIDDEN
if (what & QFileSystemMetaData::HiddenAttribute) {
// OS X >= 10.5: st_flags & UF_HIDDEN
what |= QFileSystemMetaData::PosixStatFlags;
}
#endif // defined(Q_OS_DARWIN)
// if we're asking for any of the stat(2) flags, then we're getting them all
if (what & QFileSystemMetaData::PosixStatFlags)
what |= QFileSystemMetaData::PosixStatFlags;
data.entryFlags &= ~what;
const QByteArray nativeFilePath = entry.nativeFilePath();
int entryErrno = 0; // innocent until proven otherwise
// first, we may try lstat(2). Possible outcomes:
// - success and is a symlink: filesystem entry exists, but we need stat(2)
// -> statResult = -1;
// - success and is not a symlink: filesystem entry exists and we're done
// -> statResult = 0
// - failure: really non-existent filesystem entry
// -> entryExists = false; statResult = 0;
// both stat(2) and lstat(2) may generate a number of different errno
// conditions, but of those, the only ones that could happen and the
// entry still exist are EACCES, EFAULT, ENOMEM and EOVERFLOW. If we get
// EACCES or ENOMEM, then we have no choice on how to proceed, so we may
// as well conclude it doesn't exist; EFAULT can't happen and EOVERFLOW
// shouldn't happen because we build in _LARGEFIE64.
union {
QT_STATBUF statBuffer;
struct statx statxBuffer;
};
int statResult = -1;
if (what & QFileSystemMetaData::LinkType) {
mode_t mode = 0;
statResult = qt_lstatx(nativeFilePath, &statxBuffer);
if (statResult == -ENOSYS) {
// use lstst(2)
statResult = QT_LSTAT(nativeFilePath, &statBuffer);
if (statResult == 0)
mode = statBuffer.st_mode;
} else if (statResult == 0) {
statResult = 1; // record it was statx(2) that succeeded
mode = statxBuffer.stx_mode;
}
if (statResult >= 0) {
if (S_ISLNK(mode)) {
// it's a symlink, we don't know if the file "exists"
data.entryFlags |= QFileSystemMetaData::LinkType;
statResult = -1; // force stat(2) below
} else {
// it's a reagular file and it exists
if (statResult)
data.fillFromStatxBuf(statxBuffer);
else
data.fillFromStatBuf(statBuffer);
data.knownFlagsMask |= QFileSystemMetaData::PosixStatFlags
| QFileSystemMetaData::ExistsAttribute;
data.entryFlags |= QFileSystemMetaData::ExistsAttribute;
}
} else {
// it doesn't exist
entryErrno = errno;
data.knownFlagsMask |= QFileSystemMetaData::ExistsAttribute;
}
data.knownFlagsMask |= QFileSystemMetaData::LinkType;
}
// second, we try a regular stat(2)
if (statResult == -1 && (what & QFileSystemMetaData::PosixStatFlags)) {
if (entryErrno == 0 && statResult == -1) {
data.entryFlags &= ~QFileSystemMetaData::PosixStatFlags;
statResult = qt_statx(nativeFilePath, &statxBuffer);
if (statResult == -ENOSYS) {
// use stat(2)
statResult = QT_STAT(nativeFilePath, &statBuffer);
if (statResult == 0)
data.fillFromStatBuf(statBuffer);
} else if (statResult == 0) {
data.fillFromStatxBuf(statxBuffer);
}
}
if (statResult != 0) {
entryErrno = errno;
data.birthTime_ = 0;
data.metadataChangeTime_ = 0;
data.modificationTime_ = 0;
data.accessTime_ = 0;
data.size_ = 0;
data.userId_ = (uint) -2;
data.groupId_ = (uint) -2;
}
// reset the mask
data.knownFlagsMask |= QFileSystemMetaData::PosixStatFlags
| QFileSystemMetaData::ExistsAttribute;
}
// third, we try access(2)
if (what & (QFileSystemMetaData::UserPermissions | QFileSystemMetaData::ExistsAttribute)) {
// calculate user permissions
auto checkAccess = [&](QFileSystemMetaData::MetaDataFlag flag, int mode) {
if (entryErrno != 0 || (what & flag) == 0)
return;
if (QT_ACCESS(nativeFilePath, mode) == 0) {
// access ok (and file exists)
data.entryFlags |= flag | QFileSystemMetaData::ExistsAttribute;
} else if (errno != EACCES && errno != EROFS) {
entryErrno = errno;
}
};
checkAccess(QFileSystemMetaData::UserReadPermission, R_OK);
checkAccess(QFileSystemMetaData::UserWritePermission, W_OK);
checkAccess(QFileSystemMetaData::UserExecutePermission, X_OK);
// if we still haven't found out if the file exists, try F_OK
if (entryErrno == 0 && (data.entryFlags & QFileSystemMetaData::ExistsAttribute) == 0) {
if (QT_ACCESS(nativeFilePath, F_OK) == -1)
entryErrno = errno;
else
data.entryFlags |= QFileSystemMetaData::ExistsAttribute;
}
data.knownFlagsMask |= (what & QFileSystemMetaData::UserPermissions) |
QFileSystemMetaData::ExistsAttribute;
}
#if defined(Q_OS_DARWIN)
if (what & QFileSystemMetaData::AliasType) {
if (entryErrno == 0 && hasResourcePropertyFlag(data, entry, kCFURLIsAliasFileKey))
data.entryFlags |= QFileSystemMetaData::AliasType;
data.knownFlagsMask |= QFileSystemMetaData::AliasType;
}
if (what & QFileSystemMetaData::BundleType) {
if (entryErrno == 0 && isPackage(data, entry))
data.entryFlags |= QFileSystemMetaData::BundleType;
data.knownFlagsMask |= QFileSystemMetaData::BundleType;
}
#endif
if (what & QFileSystemMetaData::HiddenAttribute
&& !data.isHidden()) {
QString fileName = entry.fileName();
if ((fileName.size() > 0 && fileName.at(0) == QLatin1Char('.'))
#if defined(Q_OS_DARWIN)
|| (entryErrno == 0 && hasResourcePropertyFlag(data, entry, kCFURLIsHiddenKey))
#endif
)
data.entryFlags |= QFileSystemMetaData::HiddenAttribute;
data.knownFlagsMask |= QFileSystemMetaData::HiddenAttribute;
}
if (entryErrno != 0) {
what &= ~QFileSystemMetaData::LinkType; // don't clear link: could be broken symlink
data.clearFlags(what);
return false;
}
return true;
}
This line is the key:
const QByteArray nativeFilePath = entry.nativeFilePath();
Tracing further, we find:
void QFileSystemEntry::resolveNativeFilePath() const
{
    if (!m_filePath.isEmpty() && m_nativeFilePath.isEmpty()) {
#ifdef Q_OS_WIN
        QString filePath = m_filePath;
        if (isRelative())
            filePath = fixIfRelativeUncPath(m_filePath);
        m_nativeFilePath = QFSFileEnginePrivate::longFileName(QDir::toNativeSeparators(filePath));
#elif defined(QFILESYSTEMENTRY_NATIVE_PATH_IS_UTF16)
        m_nativeFilePath = QDir::toNativeSeparators(m_filePath);
#else
        m_nativeFilePath = QFile::encodeName(QDir::toNativeSeparators(m_filePath));
#endif
#ifdef Q_OS_WINRT
        while (m_nativeFilePath.startsWith(QLatin1Char('\\')))
            m_nativeFilePath.remove(0,1);
        if (m_nativeFilePath.isEmpty())
            m_nativeFilePath.append(QLatin1Char('.'));
        // WinRT/MSVC2015 allows a maximum of 256 characters for a filepath
        // unless //?/ is prepended which extends the rule to have a maximum
        // of 256 characters in the filename plus the preprending path
        m_nativeFilePath.prepend("\\\\?\\");
#endif
    }
}
The key line is:
m_nativeFilePath = QFile::encodeName(QDir::toNativeSeparators(m_filePath));
That line calls the following code in qfile.h:
#if defined(Q_OS_DARWIN)
    // Mac always expects filenames in UTF-8... and decomposed...
    static inline QByteArray encodeName(const QString &fileName)
    {
        return fileName.normalized(QString::NormalizationForm_D).toUtf8();
    }
    static QString decodeName(const QByteArray &localFileName)
    {
        // note: duplicated in qglobal.cpp (qEnvironmentVariable)
        return QString::fromUtf8(localFileName).normalized(QString::NormalizationForm_C);
    }
#else
    static inline QByteArray encodeName(const QString &fileName)
    {
        return fileName.toLocal8Bit();
    }
    static QString decodeName(const QByteArray &localFileName)
    {
        return QString::fromLocal8Bit(localFileName);
    }
#endif
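So on the Darwin (Mac OS X) platform, toUtf8() is used to convert the QString to a QByteArray, while every other platform calls QString::toLocal8Bit() directly.
That is where the trap lies.
Depending on which encoding the Qt process is running with, toLocal8Bit() returns different bytes!
The comment in qstring.cpp puts it this way: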
toLocal8Bit() returns an 8-bit string using the system's local encoding.
Let's dig into what toLocal8Bit() actually does:
QByteArray QString::toLocal8Bit_helper(const QChar *data, int size)
{
    return qt_convert_to_local_8bit(QStringView(data, size));
}

static QByteArray qt_convert_to_local_8bit(QStringView string)
{
    if (string.isNull())
        return QByteArray();
#ifndef QT_NO_TEXTCODEC
    QTextCodec *localeCodec = QTextCodec::codecForLocale();
    if (localeCodec)
        return localeCodec->fromUnicode(string);
#endif // QT_NO_TEXTCODEC
    return qt_convert_to_latin1(string);
}
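Here, when the LANG variable is empty, the process runs in the C locale. By my reading, QTextCodec::codecForLocale() returns a null pointer in that case, so the code above falls through to qt_convert_to_latin1(string); (and even if a codec for the C locale were returned instead, it could not represent Chinese characters either).
The result is a QByteArray encoded as latin1, in which the Chinese characters are lost!
What we expected was a path encoded as utf8!
This also explains the earlier stat() check: toStdString() always encodes as UTF-8 regardless of the locale, so that call handed the correct bytes to the kernel.
With that, the mystery is solved.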
Addendum: redirecting all logs to a file
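Their project mixes ostream and QDebug, which is a bad habit. To make debugging easier at the time, I simply redirected everything that would have gone to the terminal into a log file. Roughly, at the top of main():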
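// Needs <fcntl.h>, <sys/stat.h>, <unistd.h>, <cstdio> and <cstdlib>.
// O_APPEND keeps new output after anything already in the file.
int fd = open("/tmp/file-manager-daemon.log",
              O_CREAT | O_WRONLY | O_APPEND | O_SYNC,
              S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
if (fd == -1) {
    perror("open() failed to open log file!");
    exit(1);
}
// Point both stdout and stderr at the log file.
dup2(fd, STDOUT_FILENO);
dup2(fd, STDERR_FILENO);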