Subject | Cannot access lock directory / Security database error |
---|---|
Author | Steffen Heil (Mailinglisten) |
Post date | 2018-03-30T18:44:03Z |
Hi
We are experiencing problems when firebird and gfix are started at about the same time.
We are running on Gentoo, and firebird is started during the boot process. Well, actually fbguard is started (as user firebird).
Then another service, which is ordered "After" firebird, runs gfix on the command line (as user root).
After gfix completes, a connection is opened to the database gfix was working on.
That works 95% of the time. However, in about 1 of 20 cases we get "Cannot access lock directory /tmp/firebird" and the service fails with "Security database error".
We traced that back to the call to gfix as root user.
gfix creates a folder /tmp/firebird and AFTER that it changes the ownership.
So for a short time, there is a folder /tmp/firebird with permissions 0700 belonging to root:root.
During that time firebird itself (running as user firebird) cannot access this folder.
We added the following code right before the call to gfix:
mkdir -m 770 -p /tmp/firebird_new
chown firebird:firebird /tmp/firebird_new
mv -T /tmp/firebird_new /tmp/firebird 2>/dev/null
rm -rf /tmp/firebird_new
Now we no longer experience that problem.
A possible (but untested) fix for this problem to be included in the firebird code is attached below.
While we have not seen the problem in more than 500 test runs since then (the tests are still running), I am unsure whether this really solves it.
Because fb_init and other files are still created with root ownership and only chowned to firebird afterwards, I expect the same problem could happen there.
My assumption is that the same scheme used in the proposal below should be applied to all shared files: create with a temporary name, fix permissions, rename (only if the target does not exist yet).
Then there would never be a file that not all participants can access.
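
To make the scheme concrete for a regular file, here is a minimal sketch. It is an illustration only, not Firebird code: publishSharedFile is a hypothetical helper name, and I use link() as the "rename only if non existent" step because it fails with EEXIST instead of replacing an existing file. A root caller would additionally have to fchown() the file to the firebird user before publishing it.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper, not part of Firebird: creates finalPath so that it is
// never visible under its final name with the wrong mode. Returns true if the
// file exists afterwards (created by us or by somebody else), false on error.
bool publishSharedFile(const std::string& finalPath, mode_t mode)
{
    // 1. create under a unique temporary name in the same directory
    std::string tmp = finalPath + ".tmp.XXXXXX";
    int fd = mkstemp(&tmp[0]);      // fills in XXXXXX, creates with mode 0600
    if (fd < 0)
        return false;

    // 2. fix permissions while nobody else knows the temporary name
    //    (assumption: a root caller would also fchown() here)
    bool ok = fchmod(fd, mode) == 0;
    close(fd);

    // 3. publish under the final name without replacing an existing file
    if (ok && link(tmp.c_str(), finalPath.c_str()) != 0)
        ok = (errno == EEXIST);     // somebody else was faster, that is fine

    unlink(tmp.c_str());            // the temporary name is no longer needed
    return ok;
}
```

If two processes race, one link() succeeds and the other sees EEXIST, so both end up using the same file, which had its final permissions before it became visible.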
Still, I am wondering how that should work if gfix (or another tool) is called by a user other than root...
Any insights are welcome.
Best regards,
Steffen
// create directory for lock files and set appropriate access rights
void createLockDirectory(const char* pathname)
{
    do
    {
        if (access(pathname, R_OK | W_OK | X_OK) == 0)
        {
            struct STAT st;
            if (os_utils::stat(pathname, &st) != 0)
                system_call_failed::raise("stat");
            if (S_ISDIR(st.st_mode))
                return;
            // not exactly original meaning, but very close to it
            system_call_failed::raise("access", ENOTDIR);
        }
    } while (SYSCALL_INTERRUPTED(errno));

    // create the directory under a unique temporary name first;
    // mkdtemp() fills in the XXXXXX suffix and creates it with mode 0700
    char pathname2[MAXPATHLEN];
    if (snprintf(pathname2, sizeof(pathname2), "%s.tmp.XXXXXX", pathname) >= (int) sizeof(pathname2))
        (Arg::Gds(isc_lock_dir_access) << pathname).raise();
    if (mkdtemp(pathname2) == NULL)
        (Arg::Gds(isc_lock_dir_access) << pathname).raise();

    // fix the access rights while nobody else uses the temporary name
    changeFileRights(pathname2, 0770);

    // publish under the final name, but never replace an existing directory
    while (renameat2(AT_FDCWD, pathname2, AT_FDCWD, pathname, RENAME_NOREPLACE) != 0)
    {
        if (SYSCALL_INTERRUPTED(errno))
        {
            continue;
        }
        if (errno == EEXIST)
        {
            // somebody else created the directory first, drop our copy
            while (rmdir(pathname2) != 0)
            {
                if (SYSCALL_INTERRUPTED(errno))
                {
                    continue;
                }
                (Arg::Gds(isc_lock_dir_access) << pathname).raise();
            }
            return;
        }
        (Arg::Gds(isc_lock_dir_access) << pathname).raise();
    }
}