JD index text format - ideas and comments, please

johnnydecimal · March 21, 2021, 8:51pm

The problem

You need an index of your items. I bang on about this on the site, and I really mean it. Why?

My system shouldn’t restrict where you keep things or what tools or sites you use. But this is where an index is critical, and this demonstrates why really nicely.

The (suggested) solution

First, the solution. And this is the endorsed-by-JD, will-be-supported-by-software format. Do this:

10-19 Area whatever
   11 Category you-know-what
   11.01 My item in one place
   - Location: Pinterest[1]
   11.02 My item in another place
   - Location: Google Drive
   11.03 One more item
   - Location: MacBook/~

[1]: And you’ll be able to [markdown](URL here) format links here so that they become clickable.

You could also leave yourself a note against any item.

   11.04 My item
   - Note: Whatever I want to remember.

Basically after any item you can enter - [Key]: [Value] and that’ll be valid.

Comments, please

Can this thread become the place where we talk about, wish for, criticise, find problems with, pick apart, find the bugs, etc., in my suggested canonical JD text index format.

I’ll document that format more formally shortly. I’m still drinking my morning tea.

jack.baty · March 21, 2021, 9:08pm

I like the idea of having a “blessed” format, especially once tooling is involved. If I were more ambitious, I would write an Emacs mode for it. On the other hand, I’ve already considered doing it right in Org mode instead, since that’s already an outliner.

But yeah, a text-based format will do nicely.

johnnydecimal · March 21, 2021, 9:33pm

Here’s my current thinking. I’ll keep this post as the one single record for this, but if I update it I’ll pop a reply at the bottom. (Unless you Discourse veterans have a better way? Lemme know.)

Current status of this post: started, i.e. not finished!

Goals

1. Allow people to keep an ‘index’ of their JD system. It must allow for the full PRO.AC.ID format if required.
2. Be able to be used without any extra tools, i.e. in any text editor.
   a. Be useable as the index in its raw form, i.e. be formatted in a way that makes it immediately useable.
3. Be trivially transferable, i.e. be copy-pasteable to any other tool.
4. Be machine-parseable so you can use software if you want. That might be mine, your own, someone else’s, whatever.

Solution

Plain-text.
The language defines itself because of the structured nature of the JD numbers.

AC.ID format

10-19 This is an area (x0-x9)
   10 This is a category (xy, where x matches the area above)
      10.01 This is an ID (xy.id, where xy matches the category above)

PRO.AC.ID format

100 This is a project (zzz)
    10-19 This is an area (x0-x9)
       10 This is a category (xy, where x matches the area above)
          10.01 This is an ID (xy.id, where xy matches the category above)
101 This is the next project
    // ...and so on
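
Because the number shape alone tells you what kind of line you are looking at, a parser can classify lines without relying on indentation. Here is a minimal sketch of that idea. The function name `classifyLine` and the exact patterns are assumptions drawn from the examples above, not part of any published spec.

```javascript
// Hypothetical helper: work out what kind of JD line this is from the
// shape of its leading number. Whitespace is ignored, per the format.
function classifyLine(line) {
  const text = line.trim();
  let m;
  // Area: x0-x9, where both halves share the same first digit.
  if ((m = /^(?<d>\d)0-\k<d>9\s+(.+)$/.exec(text))) {
    return { type: "area", number: `${m[1]}0-${m[1]}9`, title: m[2] };
  }
  // ID: xy.id
  if ((m = /^(\d{2}\.\d{2})\s+(.+)$/.exec(text))) {
    return { type: "id", number: m[1], title: m[2] };
  }
  // Project: zzz (three digits)
  if ((m = /^(\d{3})\s+(.+)$/.exec(text))) {
    return { type: "project", number: m[1], title: m[2] };
  }
  // Category: xy (two digits)
  if ((m = /^(\d{2})\s+(.+)$/.exec(text))) {
    return { type: "category", number: m[1], title: m[2] };
  }
  return null; // metadata, comments, blank lines, or anything unrecognised
}

console.log(classifyLine("   10.01 This is an ID"));
```

The order of the checks matters: the area and ID patterns are tried before the shorter project and category patterns so that `10-19` and `10.01` are never mistaken for category `10`.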

Categories must ‘belong’ to their parent area, and IDs must ‘belong’ to their parent categories.
Numbers may not duplicate; if there are duplicates, the file is invalid.
Numbers must be in order; if they are out of order, the file is invalid.
Numbers may be skipped. Category 13 may directly follow 11.
Whitespace is optional. Any given tool may format for you, as shown above, but it is semantically meaningless.
Additional metadata may be specified on any item as follows:

100 This is a project // or any other type of item
- Note: This is a note
- Location: This is a location [1]
- Anything else: This is any other text

[1]: Markdown-style hyperlinks []() are supported in metadata text.
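
The numbering rules above (belonging, no duplicates, in order, gaps allowed) are simple enough to check in one pass. The following is a rough sketch for the AC.ID form only. The name `validateIndex` and the error messages are made up for illustration, and project lines, metadata, and anything else it does not recognise are simply skipped.

```javascript
// Hypothetical validator for the AC.ID form. Returns a list of error
// strings; an empty list means the numbering rules were satisfied.
function validateIndex(text) {
  const errors = [];
  let area = null;       // first digit of the current area, e.g. "1"
  let category = null;   // current category number, e.g. "11"
  let lastArea = -1;
  let lastCategory = -1;
  let lastId = -1;

  text.split("\n").forEach((raw, i) => {
    const line = raw.trim();
    const where = `line ${i + 1}`;
    let m;
    if ((m = /^(?<d>\d)0-\k<d>9\s/.exec(line))) {
      const n = Number(m[1]);
      if (n <= lastArea) errors.push(`${where}: area duplicated or out of order`);
      lastArea = n;
      area = m[1];
      category = null;
      lastCategory = -1;
      lastId = -1;
    } else if ((m = /^(\d{2})\.(\d{2})\s/.exec(line))) {
      if (m[1] !== category) errors.push(`${where}: ID does not belong to its parent category`);
      const n = Number(m[2]);
      if (n <= lastId) errors.push(`${where}: ID duplicated or out of order`);
      lastId = n;
    } else if ((m = /^(\d{2})\s/.exec(line))) {
      if (m[1][0] !== area) errors.push(`${where}: category does not belong to its parent area`);
      const n = Number(m[1]);
      if (n <= lastCategory) errors.push(`${where}: category duplicated or out of order`);
      lastCategory = n;
      category = m[1];
      lastId = -1;
    }
    // Anything else is ignored in this sketch.
  });
  return errors;
}

console.log(validateIndex("10-19 Area\n11 Category\n11.01 Item\n13 Skipped ahead"));
```

Skipped numbers pass because the check is "strictly greater than the previous number", not "exactly one more".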

The format is as follows:

A single dash symbol. (Must be the dash, whatever the Unicode is.)
A space.
The name of the ‘key’, e.g. ‘Note’.
- This may include spaces but obviously may not include…
A colon.
A space.
The ‘value’, e.g. ‘This is my note’.

Note that the Markdown-link-style is supported but it is up to any given implementation whether to do anything with it.
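
As a sketch of how little code the metadata rule needs: the snippet below reads one `- Key: Value` line and, optionally, pulls out any Markdown-style links. The names `parseMetadata` and `extractLinks` are hypothetical, and it assumes the key may not contain a colon, which the description above leaves unfinished.

```javascript
// Hypothetical parser for one metadata line: dash, space, key, colon,
// space, value. ASSUMPTION: the key cannot contain a colon.
function parseMetadata(line) {
  const m = /^\s*- ([^:]+): (.*)$/.exec(line);
  if (!m) return null;
  return { key: m[1], value: m[2] };
}

// Optional extra: find [text](url) links inside a value. An
// implementation is free to ignore these entirely.
function extractLinks(value) {
  const links = [];
  const re = /\[([^\]]*)\]\(([^)\s]+)\)/g;
  let m;
  while ((m = re.exec(value)) !== null) {
    links.push({ text: m[1], url: m[2] });
  }
  return links;
}

const meta = parseMetadata("- Location: [Pinterest](https://example.com/board)");
console.log(meta, extractLinks(meta.value));
```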

Blank lines are ignored (but should be kept by any parser — if the user wants them, don’t remove them).
Markdown-style dividers are supported: three or more hyphens, asterisks, or underscores on a line by themselves.
JavaScript-style inline comments are supported after an item, e.g.
11 My category // This is a comment.

In this case the item is 11 My category and This is a comment is recorded as an inline comment.
The multi-line syntax /* comment */ is not supported. (Should it be?)
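
Blank lines, dividers, and inline comments can all be handled before a line is looked at more closely. This is only a sketch: `readLine` is an invented name, and it assumes a comment starts at a `//` that sits at the start of the line or after whitespace, so that a URL such as `https://…` inside a value is not cut in half.

```javascript
// Hypothetical first pass over one raw line of the index.
function readLine(raw) {
  const line = raw.trim();
  if (line === "") {
    // Blank lines carry no meaning, but a tool should keep them on output.
    return { kind: "blank", text: "", comment: null };
  }
  if (/^(-{3,}|\*{3,}|_{3,})$/.test(line)) {
    // Markdown-style divider: three or more -, * or _ on their own.
    return { kind: "divider", text: line, comment: null };
  }
  // ASSUMPTION: "//" only starts a comment at line start or after whitespace.
  const m = /^(.*?)(?:^|\s+)\/\/\s?(.*)$/.exec(line);
  if (m) {
    return { kind: "content", text: m[1], comment: m[2] };
  }
  return { kind: "content", text: line, comment: null };
}

console.log(readLine("11 My category // This is a comment."));
// text is "11 My category", comment is "This is a comment."
```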

That’ll do for a start, I have to head out. Initial thoughts? Please pick this apart, what have I missed?

Edit: added 11.
Edit: added 2.a.

bbfile · March 23, 2021, 6:01pm

The recfile format is another option. It’s plain text and very readable and there are free tools for querying/updating the content if you want to automate something.

johnnydecimal · March 23, 2021, 7:04pm

Interesting, thanks for the link. I don’t think that’ll work, however, because you made me realise what I missed…

I’m just about to add 2.a. to the Goals above:

2.a. Be useable as the index in its raw form, i.e. be formatted in a way that makes it immediately useable.

Recfile doesn’t do that, but I can see myself using it for other things at work. Looks neat.

bbfile · March 23, 2021, 7:23pm

As I was writing it I did think, “Oh this doesn’t look like a tree”, but good to bounce ideas around anyway. I’m currently putting all my comic books in a recfile so it’s on my mind

johnnydecimal · March 23, 2021, 7:26pm

All the ideas help, I want all the ideas please.

dixonge · April 1, 2021, 1:12pm

I’ve just been banging my head against the Hazel rules wall in order to try and create a dynamically updating index list in plain text. So far a solution has completely eluded me. Rules such as ‘when name changes’ do not even seem to work. Bash scripts work, if I can come up with the appropriate regex-fu, but I’m unsure how to trigger them based on a folder being added, removed, moved or renamed.

Any ideas?

bbfile · April 1, 2021, 4:35pm

I haven’t tried it, but I had considered using watchman to do something like this.

The quickstart example on the main page seems like a similar case:

These two lines establish a watch on a source directory and then set up a trigger named buildme that will run a tool named minify-css whenever a CSS file is changed. The tool will be passed a list of the changed filenames.
$ watchman watch ~/src
$ watchman -- trigger ~/src buildme '*.css' -- minify-css
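
Whatever ends up triggering it, the script that rebuilds the index can be small. Here is a rough sketch that takes a list of folder paths (for example the output of a recursive directory listing) and produces an indented index. The name `indexFromPaths` is made up, and it assumes every folder that matters is already named with its JD number first; anything else is ignored.

```javascript
// Hypothetical: turn a list of directory paths into a plain-text JD index.
// ASSUMPTION: folder names start with their JD number, e.g. "11.01 Receipts".
function indexFromPaths(paths) {
  const rows = [];
  for (const p of paths) {
    // Take the last path segment, accepting both / and \ separators.
    const name = p.split(/[\\\/]/).filter(Boolean).pop() || "";
    let m;
    if ((m = /^(?<d>\d)0-\k<d>9 /.exec(name))) {
      rows.push({ key: m[1], depth: 0, name });   // area, sorts as "1"
    } else if ((m = /^(\d{2}\.\d{2}) /.exec(name))) {
      rows.push({ key: m[1], depth: 2, name });   // ID, sorts as "11.01"
    } else if ((m = /^(\d{2}) /.exec(name))) {
      rows.push({ key: m[1], depth: 1, name });   // category, sorts as "11"
    }
  }
  // Plain string order puts "1" < "11" < "11.01" < "12" < "2".
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows.map((r) => "   ".repeat(r.depth) + r.name).join("\n");
}

console.log(indexFromPaths([
  "/docs/10-19 Finance/11 Tax/11.01 Receipts",
  "/docs/10-19 Finance",
  "/docs/10-19 Finance/11 Tax",
]));
```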

johnnydecimal · April 2, 2021, 1:36am

Oh, interesting. I’ll have a look at that … sounds similar to the thing I built for work last week which takes a DOS dir /b /s /a:d output and turns it into HTML which you paste into a SharePoint wiki page. Hang on while I sanitise and upload for anyone interested…

Here we go: github/johnnydecimal/sharepoint-decimaliser-sanitised. Hideous code but it might inspire someone.

This has me thinking though, maybe I should write some sort of command-line thing using deno.

Angelo · April 11, 2021, 4:31pm

Hello! New to the forum and the system, but reading the first post in this topic, it struck me that it looks similar to the TaskPaper format, or, more broadly, OPML (on which TaskPaper’s format is based, IIRC?).

The nice thing about OPML is that it’s meant to be fully transportable between any outlining software, and the nested folders of the Johnny.Decimal system are exactly that: an outline. Here’s a sample “directory” OPML file that could be a jumping off point, for example:

<?xml version="1.0" encoding="ISO-8859-1"?>
<opml version="2.0">
	<head>
		<title>JohnnyDecimalIndex.opml</title>
	</head>
	<body>
		<outline text="Johnny Decimal Index">
			<outline text="100-199 Personal" type="link" url="file:///Users/Angelo/Documents/100-199 Personal"/>
			<outline text="200-299 Business" type="link" url="file:///Users/Angelo/Documents/200-299 Business"/>
			<!-- Not going the full PRO.AC.ID depth, but you get the idea -->
		</outline>
	</body>
</opml>

Having a url property means it can point to web addresses, or even any file/app with a URL scheme (e.g., mindnode://open?name=JohnnyDecimaSystem.mindnode).

The not-so-nice thing about OPML is that it’s based on XML, so… it’s not exactly human-readable. But it’s trivial for software to parse, is already widely supported by outliners and feed-readers, and has been around for over a decade, so there are a tonne of libraries for reading/writing OPML in whatever programming/scripting language.

johnnydecimal · April 11, 2021, 9:47pm

I like the principle, hate the XML. But it’s a good idea: ensure that the human-readable, daily-use JD format can be automatically parsed by a dingus into XML. There’s nothing in the human form that can’t be expressed in the XML form.

Then if you need one or the other, you just run your system either way through the translator (which could just be an API on the ‘net somewhere, Netlify Functions or what have you).
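
To give a feel for how small such a translator could be, here is a sketch that turns the AC.ID text form into nested OPML `<outline>` elements. It is not an agreed mapping: `toOpml` is an invented name, nesting is derived from the number shapes and not from indentation, and metadata lines are dropped to keep the example short.

```javascript
// Escape the characters that are unsafe inside an XML attribute value.
function escapeXml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// 0 = area, 1 = category, 2 = ID, -1 = not an item line.
function depthOf(line) {
  if (/^\d0-\d9\s/.test(line)) return 0;
  if (/^\d{2}\.\d{2}\s/.test(line)) return 2;
  if (/^\d{2}\s/.test(line)) return 1;
  return -1;
}

// Hypothetical translator from the plain-text index to OPML 2.0.
function toOpml(indexText) {
  const out = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<opml version="2.0">',
    "<head><title>JD index</title></head>",
    "<body>",
  ];
  const open = []; // depths of the <outline> elements currently open
  for (const raw of indexText.split("\n")) {
    const line = raw.trim();
    const depth = depthOf(line);
    if (depth < 0) continue; // metadata etc. is skipped in this sketch
    // Close every element that is not an ancestor of this line.
    while (open.length > 0 && open[open.length - 1] >= depth) {
      open.pop();
      out.push("</outline>");
    }
    out.push(`<outline text="${escapeXml(line)}">`);
    open.push(depth);
  }
  while (open.length > 0) {
    open.pop();
    out.push("</outline>");
  }
  out.push("</body>", "</opml>");
  return out.join("\n");
}

console.log(toOpml("10-19 Area\n   11 Category\n   11.01 Item"));
```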

Angelo · April 11, 2021, 10:27pm

Yeah, pretty sure XML stands for “eXcruciating Markup Language”.

mattcanty · May 4, 2021, 8:09am

Hi everyone.

I got here via the JD Across Multiple Platforms post and I’ve been thinking about this for a few weeks.

I have several devices (Windows/Mac/Google Drive) which I want all to adhere to an indexed structure (JD!).

Having had a play around with it I keep coming round to the realisation that really it is the index which must be established first. I keep moving files around and expecting a eureka moment, but it ain’t going to happen…

I started writing a command line app in Golang. During the first day I experimented with existing machine-readable formats YAML, JSON & TOML. Unfortunately they all create clutter which we’d rather not have to deal with.

Here’s what I am considering. Firstly, my tool will allow you to create a new JD ID and manage the index for you. If you want to hand write the index then the tool can be used afterwards to validate it. Perhaps even some light fixing for whitespace. Secondly, I intend to use a Git repository to store the index remotely. This means that each time I use the tool, it checks first for changes to the index in Git. This way I can have confidence in running the commands across my different operating systems.

So with all of this… perhaps the first thing to do is start a GitHub repository where we can collaborate on the index schema?

Future Thoughts
Suppose I buy a house in the UK. Well, someone else has already defined a JD index area for that, with all of the categories. Perhaps I could re-use that.

johnnydecimal · May 5, 2021, 9:47pm

I’ve done some more thinking on this recently, and had cause to set up a new system which gave me more data. I’ll type up thoughts on the weekend.

Indeed. A library of areas or categories that you can just go and grab.

johnnydecimal · May 9, 2021, 7:02am

Okay, have a squiz at this. Very draft-y, open to all comments or suggestions.

github.com

johnnydecimal/johnnydecimal.com/blob/feature/jd-language-spec/docs/language-spec/index.md

# JD language spec

> Status: draft.

# What is this?

I want anyone to be able to write JD software, and I want to be able to transfer data easily between this different software. So this file defines the official Johnny.Decimal language specification. If you write software that reads or writes a file that looks like this, it should work with any other software that does the same.

## Goals

1. Enable simple transfer of data between programs by having the 'database' just be a text file,
2. which is as human-readable as possible; you mustn't *require* a program to maintain the file.

# Spec

> Note: defined terms are **bold**, see 'Definitions' below.

## White space is irrelevant

White space may be used to make the file more readable -- it is encouraged, and suggested rules are documented below -- but from a parsing perspective, white space is irrelevant and must be ignored.

This file has been truncated.

mattcanty · May 15, 2021, 12:59pm

I wasn’t watching the thread so missed your message. Comments:

Metadata - I’d suggest allowing single-line comments. I see a use case for the metadata to be read by tools which manage data. Suppose that a metadata could be akin to struct tags in Go.

00-09: Area with big files
  - processor: s3_backup // I don't want big files kept on this device

Thought: do we allow this?

000 Project
    00-09 Area
       00 Category
   000.00.00 ID    // Use the full PRO.AC.ID here

Is it the indentation that you are questioning to allow? Seeing the above actually makes me think that indentation should be enforced.

Shared indexes could be edited by people with conflicting formatting preferences, which would lead to competing updates. I would recommend a strict indentation.

I have tended to find the combination of ranges and numbers a little cumbersome. Take the following example:

100-199: Personal
  100: Home
    10-19: Finances
      10: Mortgage
        100.10.00: Contract
        100.10.01: Statements
    20-29: Management
      20: Service Charges & Rent
        100.20.00: Statements
200-299: Business
  etc

Do headings help?

100-199: Personal
200-299: Business

100: Home
  10-19: Finances
  20-29: Management

  10: Mortgage
    100.10.00: Contract
    100.10.01: Statements
  20: Service Charges & Rent
    100.20.00: Statements

Otherwise I think it’s looking good. I’m approaching a gap between clients in July/August, I’d be really happy to work on a parser for JD.

In other news I am still working on my index. It’s deceptively hard! Today I cut out lots of bits of paper and have taken over the dining table!

johnnydecimal · May 15, 2021, 10:35pm

mattcanty:

Metadata - I’d suggest allowing single-line comments. I see a use case for the metadata to be read by tools which manage data. Suppose that a metadata could be akin to struct tags in Go.
00-09: Area with big files
  - processor: s3_backup // I don't want big files kept on this device

That’s a good idea. Thinking of how the underlying data structure might look (i.e. is it nightmarishly hard to implement or not), you could do

// The simple version, without a comment
const myData = {
  "00-09": "Area with big files"
}

// The full version, with a comment
const myDataWithComment = {
  "00-09": {
    title: "Area with big files",
    comment: "I don’t want big files kept on this device"
  }
}

So yep, endorsed, I’ll add it to the spec later today.

johnnydecimal · May 15, 2021, 10:38pm

mattcanty:

000 Project
    00-09 Area
       00 Category
   000.00.00 ID    // Use the full PRO.AC.ID here
Is it the indentation that you are questioning to allow? Seeing the above actually makes me think that indentation should be enforced.

Sorry, wasn’t clear. I was wondering whether we allow the full PRO.AC.ID number to be put in the file like that, but specifically only in front of an ID.

Because you can always infer the full number as a result of your position in the file, but sometimes you’d also like to see it, right there.
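
Inferring the full number from position could look something like this. It is a sketch only: `fullIds` is a hypothetical name, and it accepts an ID written either in the short form (`10.01`) or already in the full form (`100.10.01`), which is exactly the open question here.

```javascript
// Hypothetical: list every ID with its full PRO.AC.ID number, using the
// most recent project line to fill in the prefix when it is not written.
function fullIds(indexText) {
  let project = null;
  const result = [];
  for (const raw of indexText.split("\n")) {
    const line = raw.trim();
    let m;
    if ((m = /^(\d{3})\s+/.exec(line))) {
      project = m[1]; // a project line such as "100 Home"
    } else if ((m = /^(?:(\d{3})\.)?(\d{2}\.\d{2})\s+(.*)$/.exec(line))) {
      // m[1] is set only when the ID was written in full.
      const pro = m[1] || project;
      if (pro !== null) result.push(`${pro}.${m[2]} ${m[3]}`);
    }
  }
  return result;
}

console.log(fullIds("100 Home\n   10-19 Finances\n      10 Mortgage\n         10.01 Statements"));
// → [ '100.10.01 Statements' ]
```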

Interesting point on indentation, I was on the fence with that one.