Skip to main content

Debugging the issue of using NFS shares for PSMDB on OpenShift


I have recently been trying to use PSMDB (Percona Server for MongoDB) as an open-source and free alternative for MongoDB Enterprise Server. I encountered an issue that the pod could not be initialized successfully with Persistent Volumes using NFS shares. I got the logs from the failed pod as follow:

------

++ id -u

++ id -g

+ install -o 1000730000 -g 0 -m 0755 -D /ps-entry.sh /data/db/ps-entry.sh

install: cannot change ownership of '/data/db/ps-entry.sh': Operation not permitted

----

I would like to share the steps how I used for debugging. The PSMD StatefulSet was deployed onto my OpenShift 3 OKD.

Check the container mount info

Go to a pod I could see the mount info as below

mongod-data → /data/db read-write

- mongod-data: Persistent volume claim name

- /data/db: container mounted directory

Check Persistent volume binding

Go to the storage, I could know which persistent volume was bound to the corresponding persistent volume claim.

Bound to volume psmdb-mongodb-data-0

Check Persistent volume

Go cluster console > storage > persistent volumes. I had

----

nfs:

 server: files.some.domain.local

 path: /srv/data/psmdb/mongodb/data-0

---

Check NFS shares (host’s mounted directory)

Remote to server and check attributes of the mounted directory

---

ssh someuser@files.some.domain.local

cd /srv/data/psmdb/mongodb

ls -la

---

The info was:

drwxrwxr-x. 2 nfsnobody nfsnobody 6 Jun 22 08:25 data-0

Check the user and group of "nfsnobody".

id nfsnobody

The info was:

uid=65534(nfsnobody) gid=65534(nfsnobody) groups=65534(nfsnobody)

Also, I could also check by "/etc/passwd"

nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin

Check container's mounted directory

Create a debug pod

---

# -c: container

# -t: attach a terminal

# --: use the entrypoint

oc debug my-cluster-name-rs0-0 -c mongo-init -t -- /bin/bash

---

Check the mounted directory attributes:

drwxrwxr-x. 2 65534 65534 25 Jun 22 08:43 db

It means the attributes are kept (the same as the host’s one). Check the files inside:

-rw-------. 1 1000920000 nfsnobody 15680 Jun 22 08:48 ps-entry.sh

Also, it was the same group as the host’s one. Only the user id was changed. It made sense. However, it seemed like the file was created but not completely set the attributes (ownership and file mode).

I checked the logged-in user of the container by executing the command "id"

uid=1000990000 gid=0(root) groups=0(root),1000990000

I ran the command from the entry point of the point ("/init-entrypoint.sh") manually within the container.

install -o "$(id -u)" -g "$(id -g)" -m 0755 -D /ps-entry.sh /data/db/ps-entry.sh

Then, I got an error:

install: cannot change ownership of '/data/db/ps-entry.sh': Operation not permitted

The currently logged-in user within the container doesn’t have permission to change the owner. Yep! I could reproduce the issue.

Hmm...What is the root cause?

The user of the container doesn't belong to group "65534(nfsnobody)", so it could not change the owner of the file. However, I could not do the following actions:

- Change group of container's user to "65543". OpenShift grants a fixed group "0 (root)" for the user.

- Modify the entry point contains the script "install -o ..." since the templates are generated and managed by the PSMDB operator.

Hence, the only way was that I need to change the file attributes of NFS shares. I changed it to group "0".

sudo chown nfsnobody:0 /srv/data/psmdb/mongodb/data-0

Remove the pod and deploy a new one. I got another error:

install: cannot create regular file '/data/db/ps-entry.sh': Permission denied

Strange! The current user has the same group "0" of the folder already. The NFS service doesn’t allow a user without group "nsfnobody" to create a file within a mounted directory.

I granted permission to write for everyone. :frowning::frowning:

sudo chmod o+w /srv/data/psmdb/mongodb/data-0

Another error...

install: cannot change ownership of '/data/db/ps-entry.sh': Operation not permitted

Check the file attributes of "/data/db/ps-enty.sh", I had

-rw-------. 1 1000990000 65534 15680 Jun 23 13:09 ps-entry.sh

As observed, it seemed like the NFS shares always use the group "65534" for the created files even the current user has group "0".

I set the attribute `s` to let Linux assign the group of the current folder to the nested files.

sudo chmod g+s /srv/data/psmdb/mongodb/data-0

It worked! Phew...

-rwxr-xr-x. 1 1000990000 root 15680 Jun 23 13:16 /data/db/ps-entry.sh

Comments

Popular posts from this blog

DevOps for Dummies

Everyone talks about it, but not everyone knows what it is. Why DevOps? In general, whenever an organization adopts any new technology, methodology, or approach, that adoption has to be driven by a business need. Any kind of system that need rapid delivery of innovation requires DevOps (development and operations). Why? DevOps requires mechanisms to get fast feedback from all the stakeholders in the software application that's being delivered. DevOps approaches to reduce waste and rework and to shift resources to higher-value activities. DevOps aims to deliver value (of organization or project) faster and more efficiently. DevOps Capabilities The capabilities that make up DevOps are a broad set that span the software delivery life cycle. The following picture is a reference architecture which provides a template of a proven solution by using a set of preferred methods and capabilities. My Remarks Okay, that sounds cool. What does it simply mean, again? The f...

AngularJS - Build a custom validation directive for using multiple emails in textarea

AngularJS already supports the built-in validation with text input with type email. Something simple likes the following: <input name="input" ng-model="email.text" required="" type="email" /> <span class="error" ng-show="myForm.input.$error.email"> Not valid email!</span> However, I used a text area and I wanted to enter some email addresses that's saparated by a comma (,). I had a short research and it looked like AngualarJS has not supported this functionality so far. Therefore, I needed to build a custom directive that I could add my own validation functions. My validation was done only on client side, so I used the $validators object. Note that, there is the $asyncValidators object which handles asynchronous validation, such as making an $http request to the backend. This is just my implementation on my project. In order to understand that, I supposed you already had experiences with ...

Junit - Test fails on French or German string assertion

In my previous post about building a regex to check a text without special characters but allow German and French . I met a problem that the unit test works fine on my machine using Eclipse, but it was fail when running on Jenkins' build job. Here is my test: @Test public void shouldAllowFrenchAndGermanCharacters(){ String source = "ÄäÖöÜüß áÁàÀâÂéÉèÈêÊîÎçÇ"; assertFalse(SpecialCharactersUtils.isExistSpecialCharater(source)); } Production code: public static boolean isExistNotAllowedCharacters(String source){ Pattern regex = Pattern.compile("^[a-zA-Z_0-9_ÄäÖöÜüß áÁàÀâÂéÉèÈêÊîÎçÇ]*$"); Matcher matcher = regex.matcher(source); return !matcher.matches(); } The result likes the following: Failed tests: SpecialCharactersUtilsTest.shouldAllowFrenchAndGermanCharacters:32 null A guy from stackoverflow.com says: "This is probably due to the default encoding used for your Java source files. The ö in the string literal in the J...

Creating a Chatbot with RiveScript in Java

Motivation "Artificial Intelligence (AI) is considered a major innovation that could disrupt many things. Some people even compare it to the Internet. A large investor firm predicted that some AI startups could become the next Apple, Google or Amazon within five years"   - Prof. John Vu, Carnegie Mellon University. Using chatbots to support our daily tasks is super useful and interesting. In fact, "Jenkins CI, Jira Cloud, and Bitbucket" have been becoming must-have apps in Slack of my team these days. There are some existing approaches for chatbots including pattern matching, algorithms, and neutral networks. RiveScript is a scripting language using "pattern matching" as a simple and powerful approach for building up a Chabot. Architecture Actually, it was flexible to choose a programming language for the used Rivescript interpreter like Java, Go, Javascript, Python, and Perl. I went with Java. Used Technologies and Tools Oracle JDK 1.8...

What the heck is Meteor DDP?

I was using Meteor for my messenger project. I was so curious about the real time connection. I wanted to know how exactly this mechanism works. In this post, I will go through the DDP Specification, an overview of WebSocket, and a simple demo about how to subscribe a publication of Rocket.Chat (containing a DDP server) from an external webpage. At a glance, I knew that Meteor invented a protocol called DDP which uses for handling real time connection. So then, what is DDP? "DDP (Distributed Data Protocol) is the stateful WebSocket protocol that Meteor uses to communicate between the client and the server." [1] All right! Why does DDP matter? "DDP is a standard way to solve the biggest problem facing client-side JavaScript developers: querying a server-side database, sending the results down to the client, and then pushing changes to the client whenever anything changes in the database" . [2] In order to understand deeply the protocol, I decided ...