Extend-the-end `+` vs. `.01`, `.02`

Addendum: This excerpt from Desiderata for Controlled Medical Vocabularies (Cimino, 1998) highlights the need for non-semantic concept identifiers. I think it ties nicely into this discussion.

Non-Semantic Concept Identifiers

Because many vocabularies are organized into strict hierarchies, there has been an irresistible temptation to make the unique identifier a hierarchical code which reflects the concept’s position in the hierarchy. For example, a concept with the code 1000 might be the parent of the concept with the code 1200 which, in turn might be the parent of the concept 1280, and so on. One advantage to this approach is that, with some familiarity, the codes become somewhat readable to a human and their hierarchical relationships can be understood. With today’s computer interfaces, however, there is little reason why humans need to have readable codes or, for that matter, why they even need to see the codes at all. Another advantage of hierarchical codes is that querying a database for members of a class becomes easier (e.g., searching for “all codes beginning with 1” will retrieve codes 1000, 1200, 1280, and so on). However, this advantage is lost if the concepts can appear in multiple places in the hierarchy (see “Polyhierarchy”, below); fortunately, there are other ways to perform “class-based” queries to a database which will work even when concepts can be in multiple classes [32].

There are several problems with using the concept identifier to convey hierarchical information. First, it is possible for the coding system to run out of room. A decimal code, such as the one described above, will only allow ten concepts at any level in the hierarchy and only allow a depth of four [34]. Coding systems can be designed to avoid this problem, but other problems remain. For example, once assigned a code, a concept can never be reclassified without breaking the hierarchical coding scheme. Even more problematic, if a concept belongs in more than one location in the hierarchy (see “Polyhierarchy”, below), a convenient single hierarchical identifier is no longer possible. It is desirable, therefore, to have the unique identifiers for the concepts which are free of hierarchical or other implicit meaning (i.e., nonsemantic concept identifiers); such information should instead be included as attributes of the concepts [14].
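
To make the excerpt's contrast concrete, here is a minimal Python sketch (mine, not from the paper). The codes 1000, 1200 and 1280 come from the excerpt; everything else (the code 2000, the `c1` to `c4` identifiers, the concept names, and both function names) is made up for illustration. It compares a prefix query on hierarchical codes with a class query that reads a `parents` attribute on non-semantic IDs.

```python
# Hierarchical codes: the code itself encodes the position in the hierarchy.
# 1000/1200/1280 are from the excerpt; 2000 and the names are hypothetical.
hierarchical = {
    "1000": "parent concept",
    "1200": "child concept",
    "1280": "grandchild concept",
    "2000": "unrelated concept",
}


def members_by_prefix(codes, prefix):
    """Class query that depends on the structure of the code."""
    return sorted(code for code in codes if code.startswith(prefix))


# Non-semantic IDs (hypothetical): the hierarchy is stored as an attribute,
# so one concept can have several parents (polyhierarchy).
concepts = {
    "c1": {"name": "parent concept", "parents": []},
    "c2": {"name": "other parent", "parents": []},
    "c3": {"name": "child concept", "parents": ["c1", "c2"]},
    "c4": {"name": "grandchild concept", "parents": ["c3"]},
}


def descendants(concepts, root):
    """Class query that walks the 'parents' attribute instead of parsing IDs."""
    found = set()
    frontier = [root]
    while frontier:
        current = frontier.pop()
        for cid, concept in concepts.items():
            if current in concept["parents"] and cid not in found:
                found.add(cid)
                frontier.append(cid)
    return sorted(found)


prefix_hits = members_by_prefix(hierarchical, "1")
under_c1 = descendants(concepts, "c1")
under_c2 = descendants(concepts, "c2")
```

In this toy data `c3` sits under both `c1` and `c2`, which a single hierarchical code could not express, yet both class queries still find it.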

Conclusion

Johnny Decimal strikes a balance between semantic encoding (areas, categories) and non-semantic identifiers at the item level. It takes the best of both worlds.

The system establishes immutable, mutually exclusive classes with a maximum depth of two levels. Everything beneath that is flat. Concepts at this level may have semantic relationships with each other, but JD codes don’t attempt to represent them. Instead, move the meaning of the ID into the concept’s definition (JD note).

This is intentional: below the category level, meaning becomes fluid and subject to change. That’s where the system stops encoding semantics — to preserve stability.

This is a Sacred Truth. @Alex heard a Stack Overflow engineer say it on a podcast. “We never actually reduce complexity”, she said. “We just move it around.” Or words to that effect. He may be able to find us the podcast link.

And we must not forget that search exists! Sure it’s nice to just neatly follow your path down through areas and categories to IDs but if you get there and :thinking: it’s not there, we can immediately fall back to search. The thing you’re looking for should absolutely be in a note’s title, and if it isn’t, consider adding it.

Also your tool shop sounds amazing and it’s my life’s goal to see it in person.

:grin:. I actually just removed the one + I had in my personal system because after the discussion we had I realised I didn’t need it.

I guess I thought I’d take a + for a spin and see how I felt about it. But, in this particular case, I have plenty of room for new IDs in my travel category so I just made a new ID.

Another example from trawling an old Discord thread. I’d suggested that there might be a gap in Life Admin for managing other people’s IT stuff. I help my bestie out with her small business so I manage her domains and backups.

Answer from @aviskase was, no: this is textbook EtE. Because what I’m managing is still 14.14 My data storage & backups, but not mine, someone else’s.

So I have 14.14+MGH … – her initials – and I +MGH anything else of hers I manage, and now I can filter all of those out trivially with search.

This is what EtE is for.
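
If your index lives in plain text, that filter can be sketched in a few lines of Python. This is only an illustration under assumptions: the regular expression is my guess at an `AC.ID+extension title` shape rather than an official Johnny.Decimal grammar, `with_extension` is a hypothetical helper, and the `14.99+MGH` entry is invented.

```python
import re

# Assumed shape: "AC.ID", an optional "+extension", whitespace, then a title.
JD_ENTRY = re.compile(r"^(?P<id>\d{2}\.\d{2})(?:\+(?P<ext>\S+))?\s+(?P<title>.+)$")

entries = [
    "14.14 My data storage & backups",
    "14.14+MGH …",              # title left elided, as in the post
    "14.99+MGH Example item",   # hypothetical ID and title
    "12.14 Inventory",
]


def with_extension(entries, ext):
    """Return the entries whose ID carries the given +extension."""
    hits = []
    for entry in entries:
        match = JD_ENTRY.match(entry)
        if match and match.group("ext") == ext:
            hits.append(entry)
    return hits


mgh_entries = with_extension(entries, "MGH")
```

In practice a plain search for `+MGH` in your notes app or file manager does the same job; the sketch just shows that the suffix is easy to match mechanically.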

Another one (noting here as this might make its way to the website in the future so this is my scratchpad now).

LAS has 12.14 Inventory. So you might have a collection of clothes, wine, figurines, crochet patterns, whatever. Which you record in an inventory.

These things are all inventories. So you might like to keep them separate and EtE them under 12.14.

nice clarification.

Just to manage expectations, it’s amazing primarily for the fact that it works at all in its square footage. But I look forward to giving you the tour.

Thanks for the quote! I’ve come to the conclusion that for me, the decimal numbers are mainly for the computer to ensure things stay in order. I remember the names and use those. But it’s good to have the distinction/continuum spelled out, to realize the other way of seeing it can still be important.

Also re: @Don_F:
And then there may be another line at the top, the name of the individual person being addressed. I presume that we won’t want to put the names of all people who work somewhere in the structure; rather, we will have them listed in our address book with their address up to the suite/room level as an attribute.

There’s some kind of metaphor about the suite address being where the postal delivery ends, and delivery to the person being an internal function of the office, but I’m not yet fully awake so take my word for it.

I managed to miss your longer comment above. This is a really helpful metaphor.

FYI this thread spawned a question that deserved its own thread.

Sorry to bump this thread. The above discussion is about extending the index, but I’m still not quite sure how to handle this in a filesystem.

Is the sub id added to the id at that same level, or is it an actual subfolder? And if the latter, how would I name this?

So, for example:

Alternative 1:

10-19 Some area
  14 Some category
    14.53 Some ID
    14.53+01 The extended ID
    14.53+02 Another extension

or

Alternative 2:

10-19 Some area
  14 Some category
    14.53 Some ID
      14.53+01 The extended ID
      14.53+02 Another extension

or

Alternative 3:

10-19 Some area
  14 Some category
    14.53 Some ID
      01 The extended ID
      02 Another extension

I’m asking because when I use OneDrive for Business to share a link to a folder, the first and second alternatives put the full ID in the shared link (the folder name is the clickable text in an email), but the first might clutter the category in the file system. Alternative 3 is more concise imho, and uses fewer characters in the file path. (I occasionally run into file paths that are too long.)
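
For anyone weighing the same trade-off, here is a rough Python sketch that builds the path to the first extension under each alternative and measures it. It uses only the example names from this post. Real paths also include the drive or sync root, which is not counted here, and treating the final folder name as the link text is taken from the description above rather than from any OneDrive documentation.

```python
from pathlib import PurePosixPath

area = "10-19 Some area"
category = "14 Some category"
item = "14.53 Some ID"
ext_title = "The extended ID"

# Path to the first extension folder under each alternative.
layouts = {
    "Alternative 1": PurePosixPath(area, category, f"14.53+01 {ext_title}"),
    "Alternative 2": PurePosixPath(area, category, item, f"14.53+01 {ext_title}"),
    "Alternative 3": PurePosixPath(area, category, item, f"01 {ext_title}"),
}

# Relative path length in characters, separators included.
lengths = {name: len(str(path)) for name, path in layouts.items()}

# The final folder name, i.e. what would show up as the clickable link text.
link_text = {name: path.name for name, path in layouts.items()}

for name in layouts:
    print(f"{name}: {lengths[name]} chars, link text {link_text[name]!r}")
```

With these example names, Alternative 3 saves the six characters of `14.53+` compared with Alternative 2, while Alternative 1 comes out shortest overall because it has one folder level fewer. Your own titles will shift the numbers.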

Looking forward to your opinions or, even better, experiences.

Hi, you might want to take a look at this. The third option is recommended by Johnny.Decimal: only the Sub-ID is included in the folder name.

Personally, I avoid Sub-IDs because they lead to re-indexing if a child item becomes its own thing with a full AC.ID. I have outlined an alternative approach to Sub-IDs here.

No apology needed, there are no dead threads here. They just nap while they wait for us to come back to them.

I still have a bit of thinking to do here but I like, and use, and can recommend, subfolders that only contain the + part, and literally start with a +! For example I have:

11.14 Licences
   └─ + Drivers licence

I can spare the extra character, and this really makes it clear that this is an extended thing. I know what to expect in my JDex. And everything after the + still sorts like it would without it, so it doesn’t really affect anything. It doesn’t interfere.
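
A quick way to convince yourself of the sorting claim, as a Python sketch. Only "Drivers licence" is from my real system; the other two names are made up, and this uses plain character-by-character sorting, which your file manager may or may not match exactly.

```python
# "Drivers licence" is from the post; the other names are hypothetical.
names = ["Passport", "Drivers licence", "Fishing licence"]
plus_names = [f"+ {name}" for name in names]

sorted_plain = sorted(names)
sorted_plus = sorted(plus_names)

# Strip the leading "+ " again and compare the two orderings.
same_order = [name[2:] for name in sorted_plus] == sorted_plain
```

Because every folder shares the same `+ ` prefix, the comparison is decided by whatever follows it, so the order matches the unprefixed one.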

And it says to me that this isn’t just an un-considered subfolder. It’s a legitimate thing. I find some comfort in that.

I don’t have many of these yet, so there’s not a lot of data, but I’m pretty comfortable with this.

Hi, I’m mostly using “Alternative 1” and in some cases “Alternative 2”.

The latter usually in places where I consider the “extensions” important enough to see them at a first glance when browsing through the index / file system / wiki / whatever.

The former mostly in the case of many extensions.

Also I try not to mix them but yeah things happen. :sweat_smile: