Building a folder sync tool in Go (and the saga of the ghost folders)

Repository: github.com/LuizFernando991/go-sync-folder

For a long time I wanted a simple way to share the same folders across my devices: my Obsidian vault, a few directories of everyday stuff, that kind of thing. I could have used Dropbox, Drive, Syncthing… but two things came together: (1) Oracle Cloud offers a surprisingly generous free VM, and (2) what I really wanted was to learn Go more deeply. So instead of installing something ready-made, I decided to build it.

The starting point was this video: How To Build A Complete Distributed File Storage In Golang — a genuinely long tutorial (we're talking hours) that builds a distributed file storage system in Go from scratch. It gave me the mental model; from there the project took its own direction, focused on my use case: mirroring folders between machines through a self-hosted central server.

An honest disclaimer: this started as a set of simple experiments, and I refined it by trial and error until it became code I consider good. You can still find rough corners left over from those experiments. I also never treated this as a product, so feel free to adapt it. The most glaring example: the "authentication" is just a token compared in constant time. I've built robust, secure auth many times before; for a learning project, I'd rather spend my energy on the core of the problem (synchronization) than on reinventing login.

What it does

The idea is a "minimalist, self-hosted Dropbox for Linux":

A central server (syncdrive-server) holds the authoritative copy of the files and a manifest (the list of files and folders, with hash, size, and timestamp).
A daemon (syncdrive-daemon) runs on each machine, watches local folders, and keeps them mirrored with the server — files and directories, including empty ones.

        ┌───────────────────────────────────────────────┐
        │               syncdrive-server                │
        │   files/   +   manifest.json (files & dirs)   │
        └───────────────▲─────────────────▲─────────────┘
                        │  HTTP + Bearer  │
            ┌───────────┘                 └────────────┐
   ┌────────┴──────────┐                     ┌─────────┴────────┐
   │  syncdrive-daemon │                     │ syncdrive-daemon │
   │  watcher + workers│                     │ watcher + workers│
   └───────────────────┘                     └──────────────────┘
        machine A                                  machine B

Three modes per folder: two-way (upload and download), push (upload only), and pull (download only).

The data model: the manifest

Everything revolves around lightweight metadata. Nothing is shipped just to detect a change: the client compares metadata and transfers only what actually changed.

type FileMeta struct {
	Path    string    `json:"path"`
	Size    int64     `json:"size"`
	SHA256  string    `json:"sha256"`
	ModTime time.Time `json:"mod_time"`
}

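type DirMeta struct {
	Path    string    `json:"path"`
	// Note: omitempty has no effect on a struct type such as time.Time;
	// a zero ModTime is still written out.
	ModTime time.Time `json:"mod_time,omitempty"`
}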

type Manifest struct {
	Files map[string]FileMeta `json:"files"`
	Dirs  map[string]DirMeta  `json:"dirs,omitempty"`
}

The server keeps this manifest. Each client additionally keeps local state in .syncdrive/state.json: a snapshot of "how things looked at the last successful sync." That state is what lets us tell a real deletion apart from a file that never arrived. Hold on to that detail — it's the heart of everything.
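One practical wrinkle with these types: when the JSON has no "dirs" key (or no "files" key), encoding/json leaves the corresponding map nil, and writing to a nil map panics. A minimal sketch of a defensive loader follows; decodeManifest is a hypothetical helper for illustration, not a function from the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

type FileMeta struct {
	Path    string    `json:"path"`
	Size    int64     `json:"size"`
	SHA256  string    `json:"sha256"`
	ModTime time.Time `json:"mod_time"`
}

type DirMeta struct {
	Path    string    `json:"path"`
	ModTime time.Time `json:"mod_time,omitempty"`
}

type Manifest struct {
	Files map[string]FileMeta `json:"files"`
	Dirs  map[string]DirMeta  `json:"dirs,omitempty"`
}

// decodeManifest (hypothetical) parses a manifest and guarantees that both
// maps are non-nil, so callers can insert entries without extra checks.
func decodeManifest(data []byte) (Manifest, error) {
	var m Manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return Manifest{}, err
	}
	if m.Files == nil {
		m.Files = make(map[string]FileMeta)
	}
	if m.Dirs == nil {
		m.Dirs = make(map[string]DirMeta)
	}
	return m, nil
}

func main() {
	raw := []byte(`{"files":{"notes/a.md":{"path":"notes/a.md","size":3,"sha256":"abc","mod_time":"2024-01-02T03:04:05Z"}}}`)
	m, err := decodeManifest(raw)
	if err != nil {
		panic(err)
	}
	m.Dirs["notes"] = DirMeta{Path: "notes"} // safe: Dirs was initialized
	fmt.Println(len(m.Files), len(m.Dirs))   // prints: 1 1
}
```

The same guard applies to whatever type is used for the local state file, since it is read back from JSON in the same way.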

The core: the three-way merge

The central question of any sync tool is: given a path, what do I do? Upload? Download? Delete? Ignore?

The answer comes from comparing three views of the same path:

Local — what's on disk right now.
Remote — what the server manifest says.
State — what was true at the last sync.

With those three, you can infer intent. Example: the file exists on the server, doesn't exist locally, and was in the state → that means I deleted it since the last sync, so I should delete it on the server. But if it exists on the server, doesn't exist locally, and was not in the state → it's a new file from another machine, so I should download it.

All of that decision lives in a single pure function, decideFile. Concentrating the logic in one place was one of the project's best decisions (it used to be duplicated and drifting between two code paths):

func decideFile(mode FolderMode, l FileMeta, hasLocal bool, r FileMeta, hasRemote bool, old FileMeta, hadOld bool) (fileAction, FileMeta) {
	switch {
	case hasLocal && !hasRemote:
		if mode == ModePull {
			return actKeep, l
		}
		if hadOld && old.SHA256 == l.SHA256 {
			return actDeleteLocal, l // was in state and vanished from remote → deletion
		}
		return actUpload, l // new local file → upload

	case !hasLocal && hasRemote:
		if mode == ModePush {
			return actKeep, r
		}
		if hadOld && old.SHA256 == r.SHA256 {
			return actDeleteRemote, r
		}
		return actDownload, r

	case hasLocal && hasRemote:
		// ... two-way: if the hashes match, nothing to do.
		if l.SHA256 != "" && l.SHA256 == r.SHA256 {
			return actKeep, l
		}
		localChanged := !hadOld || old.SHA256 != l.SHA256
		remoteChanged := !hadOld || old.SHA256 != r.SHA256
		switch {
		case localChanged && remoteChanged: // real conflict
			if l.ModTime.After(r.ModTime) {
				return actUpload, l // the newer one wins
			}
			return actDownload, r
		case localChanged:
			return actUpload, l
		case remoteChanged:
			return actDownload, r
		}
	}
	return actNone, FileMeta{}
}

The conflict policy is "the most recent wins" by ModTime, with the server as the tiebreaker when timestamps are equal (so all machines converge). It's simple, predictable, and good enough for my use.

A detail I liked: the disk scan doesn't recompute the SHA-256 of everything on every pass. If the size and modification time match the state, it reuses the already-known hash. Hashing a large file for nothing is expensive; this avoids it.
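A minimal sketch of that hash reuse, under stated assumptions: the known type and the function names are hypothetical stand-ins, not the repository's code. The idea is simply to consult the saved state before opening the file.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"time"
)

// known is a hypothetical stand-in for one entry of the saved state.
type known struct {
	Size    int64
	ModTime time.Time
	SHA256  string
}

// reusableHash returns the remembered hash when size and modification time
// are unchanged since the last sync.
func reusableHash(prev known, hadPrev bool, size int64, modTime time.Time) (string, bool) {
	if hadPrev && prev.SHA256 != "" && prev.Size == size && prev.ModTime.Equal(modTime) {
		return prev.SHA256, true
	}
	return "", false
}

// hashFile streams the file through SHA-256 without loading it into memory.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// fileHash only reads the file when the cheap metadata check fails.
func fileHash(path string, prev known, hadPrev bool) (string, error) {
	info, err := os.Stat(path)
	if err != nil {
		return "", err
	}
	if sum, ok := reusableHash(prev, hadPrev, info.Size(), info.ModTime()); ok {
		return sum, nil
	}
	return hashFile(path)
}

func main() {
	f, err := os.CreateTemp("", "hashdemo")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	f.WriteString("hello")
	f.Close()

	first, err := fileHash(f.Name(), known{}, false) // no state yet: hashes the file
	if err != nil {
		panic(err)
	}
	info, _ := os.Stat(f.Name())
	prev := known{Size: info.Size(), ModTime: info.ModTime(), SHA256: first}
	second, _ := fileHash(f.Name(), prev, true) // state matches: no read needed
	fmt.Println(first == second)
}
```

The usual trade-off applies: an edit that keeps both the size and the mtime identical would go unnoticed by this shortcut.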

Concurrency and parallelism: why this is great for learning Go

This is where the project got genuinely fun, and where Go shines. It's got goroutines, channels, worker pools, select, sync.Mutex, sync.WaitGroup — the whole package.

A worker pool for transfers

Transfers (upload/download) run in parallel in a pool. Downloads go before uploads, and within each kind the smaller files go first — so a huge upload doesn't starve the small ones:

ch := make(chan syncJob)
var wg sync.WaitGroup

for range workers { // ranging over an int requires Go 1.22+
	wg.Add(1)
	go func() {
		defer wg.Done()
		for job := range ch { // receive work over the channel
			// (declarations of result and err are elided in this excerpt)
			switch job.kind {
			case jobUpload:
				result, err = s.upload(root, job.path, job.meta)
			case jobDownload:
				err = s.download(root, job.path, job.meta)
			}
			// ...write the result into the next state (guarded by a mutex)
		}
	}()
}

for _, job := range jobs {
	ch <- job // dispatch
}
close(ch)  // close → workers finish the range and exit
wg.Wait()  // wait for all of them

This is the classic fan-out pattern with a channel: the producer sends jobs, N workers consume. close(ch) + range is the idiomatic way to signal "we're done."
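The dispatch order described above (downloads before uploads, smaller files first within each kind) comes down to sorting the job slice before feeding the channel. A minimal sketch, assuming a simplified job type with an explicit size field; the type and orderJobs are hypothetical, not the repository's syncJob.

```go
package main

import (
	"fmt"
	"sort"
)

type jobKind int

const (
	jobDownload jobKind = iota // lower value, so downloads sort first
	jobUpload
)

// job is a simplified, hypothetical stand-in for the article's syncJob.
type job struct {
	kind jobKind
	path string
	size int64
}

// orderJobs sorts in place: downloads before uploads, then smaller files
// first, with the path as a final tiebreak so the order is deterministic.
func orderJobs(jobs []job) {
	sort.SliceStable(jobs, func(i, j int) bool {
		a, b := jobs[i], jobs[j]
		if a.kind != b.kind {
			return a.kind < b.kind
		}
		if a.size != b.size {
			return a.size < b.size
		}
		return a.path < b.path
	})
}

func main() {
	jobs := []job{
		{kind: jobUpload, path: "video.mp4", size: 900_000_000},
		{kind: jobDownload, path: "photo.jpg", size: 4_000_000},
		{kind: jobUpload, path: "todo.txt", size: 120},
		{kind: jobDownload, path: "note.md", size: 800},
	}
	orderJobs(jobs)
	for _, j := range jobs {
		fmt.Println(j.path)
	}
	// prints note.md, photo.jpg, todo.txt, video.mp4 (one per line)
}
```

Because the channel is unbuffered, workers pick jobs up in exactly this order, so the sort directly decides what gets transferred first.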

The continuous daemon: a pool that never sleeps

In daemon mode I wanted something more reactive: an edit to a .txt should upload immediately, even if a large upload is occupying another worker. So the FolderSyncer keeps a permanent pool of workers, and a scan just enqueues jobs and moves on — it doesn't wait for the transfers.

That brought a lovely concurrency problem: how do you coalesce scans? If 10 filesystem events arrive in a row, I don't want 10 stacked scans — I want one, after the last change. The solution is a "pending" flag:

func (fs *FolderSyncer) Trigger() {
	fs.scanMu.Lock()
	if fs.scanRunning {
		fs.scanPending = true   // a scan is already running? mark it to re-run at the end
		fs.scanMu.Unlock()
		return
	}
	fs.scanRunning = true
	fs.scanMu.Unlock()

	go func() {
		for {
			fs.scan()
			fs.scanMu.Lock()
			if !fs.scanPending {     // nobody asked again → stop
				fs.scanRunning = false
				fs.scanMu.Unlock()
				return
			}
			fs.scanPending = false   // a request arrived during the scan → run once more
			fs.scanMu.Unlock()
		}
	}()
}

It guarantees one scan at a time (no overlapping scans) and at least one scan after the last change (the final "edge" is never lost). Honest coalescing in about two dozen lines.

Watching the filesystem

The trigger comes from a watcher built on fsnotify, which accumulates "dirty" paths and signals without blocking:

func (w *Watcher) signal(path string) {
	w.dirtyMu.Lock()
	w.dirty[path] = struct{}{}
	w.dirtyMu.Unlock()
	select {
	case w.events <- struct{}{}: // buffer-1 channel: "there's news"
	default:                     // a signal is already pending → don't block
	}
}

That select with default is a trick I've grown to love in Go: send on a channel if possible, without ever blocking.

The race that nearly got me: stopping the pool

There's a treacherous subtlety: if you close the jobs channel (close(jobsCh)) on shutdown while a scan can still enqueue a job, you get a panic: send on closed channel. The idiomatic fix is to not close the jobs channel; instead, close a done channel and have everyone listen to it:

func (fs *FolderSyncer) Stop() { close(fs.done) } // that's it (call it once: closing an already-closed channel panics)

func (fs *FolderSyncer) worker() {
	for {
		select {
		case <-fs.done:           // shut down
			return
		case job := <-fs.jobsCh:  // or do work
			// ...
		}
	}
}

// and the send observes done too:
go func() {
	select {
	case fs.jobsCh <- job:
	case <-fs.done: // aborting: still release the WaitGroup
		fs.jobsWg.Done()
	}
}()

Small, but it's exactly the kind of concurrency bug that only shows up under load — and that teaches you to think in Go.
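To see the pattern end to end, here is a self-contained sketch under stated assumptions: pool is a hypothetical, stripped-down stand-in for FolderSyncer, jobs are plain ints, and Stop differs from the snippet above in two ways (it uses sync.Once so a second call is harmless, and it waits for the workers to exit).

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pool is a hypothetical, reduced stand-in for the article's FolderSyncer.
type pool struct {
	jobs      chan int      // never closed, so a send can never panic
	done      chan struct{} // closed exactly once to signal shutdown
	stopOnce  sync.Once
	workers   sync.WaitGroup
	processed atomic.Int64
}

func newPool(n int) *pool {
	p := &pool{jobs: make(chan int), done: make(chan struct{})}
	for i := 0; i < n; i++ {
		p.workers.Add(1)
		go p.worker()
	}
	return p
}

func (p *pool) worker() {
	defer p.workers.Done()
	for {
		select {
		case <-p.done: // shut down
			return
		case <-p.jobs: // or do work
			p.processed.Add(1) // stand-in for the real transfer
		}
	}
}

// Enqueue reports whether a worker accepted the job. After Stop it returns
// false instead of panicking, because the send also observes done.
func (p *pool) Enqueue(job int) bool {
	select {
	case p.jobs <- job:
		return true
	case <-p.done:
		return false
	}
}

// Stop closes done at most once and waits for every worker to exit.
func (p *pool) Stop() {
	p.stopOnce.Do(func() { close(p.done) })
	p.workers.Wait()
}

func main() {
	p := newPool(2)
	for i := 0; i < 5; i++ {
		p.Enqueue(i)
	}
	p.Stop()
	p.Stop()                                       // second call is a no-op
	fmt.Println(p.Enqueue(99), p.processed.Load()) // prints: false 5
}
```

Waiting inside Stop is a design choice of this sketch: once Stop returns, no worker is left to receive, so a late Enqueue is rejected deterministically.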

How the sync runs today (a full scan)

Let me be upfront about a current limitation: the authoritative reconciliation is a full scan. On every sync, the daemon walks the entire local folder tree (filepath.WalkDir), fetches the complete manifest from the server, builds the three views (local, remote, state), and runs BuildPlan over the union of all paths.

There's a shortcut: when the watcher reports that a specific file changed, I fast-path just that path so it uploads right away — but I still kick off the full scan afterward to reconcile directories and anything the shortcut can't see.

In practice, for normal-sized folders (Obsidian, documents) this is fast and works really well: comparing metadata is cheap, the scan reuses hashes by size+mtime, and polling uses an ETag (304 Not Modified) so it doesn't refetch the manifest for nothing. But of course, walking the whole tree doesn't scale for free to millions of files.

I'm still looking for alternatives to improve this part — things like true incremental reconciliation (trusting the watcher events more), a change index/journal, or diffing only the affected subtrees. For now, the full scan is simple, predictable, and serves me well — so I left it that way on purpose, until I find a better approach worth the extra complexity.
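The conditional polling mentioned in this section (ETag plus 304 Not Modified) can be sketched without any network by driving a handler through httptest. This is an illustration under assumptions: the /manifest path, the handler, and the poll helper are hypothetical, and the ETag comparison is an exact string match that ignores weak validators and comma-separated lists.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// manifestHandler (hypothetical) serves a fixed body and derives its ETag
// from the SHA-256 of that body.
func manifestHandler(body []byte) http.Handler {
	sum := sha256.Sum256(body)
	etag := `"` + hex.EncodeToString(sum[:]) + `"`
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("ETag", etag)
		// Simplification: exact match only.
		if r.Header.Get("If-None-Match") == etag {
			w.WriteHeader(http.StatusNotModified) // 304, no body
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	})
}

// poll performs one in-process request, sending the ETag remembered from
// the previous poll (if any) as If-None-Match.
func poll(h http.Handler, lastETag string) (status int, etag string, body string) {
	req := httptest.NewRequest(http.MethodGet, "/manifest", nil)
	if lastETag != "" {
		req.Header.Set("If-None-Match", lastETag)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	res := rec.Result()
	return res.StatusCode, res.Header.Get("ETag"), rec.Body.String()
}

func main() {
	h := manifestHandler([]byte(`{"files":{}}`))

	status, etag, _ := poll(h, "") // first poll: full manifest
	fmt.Println(status)            // prints: 200

	status, _, body := poll(h, etag) // nothing changed since then
	fmt.Println(status, len(body))   // prints: 304 0
}
```

On a 304 the client keeps using the manifest it already has, which is what makes frequent polling cheap.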

The saga of the ghost folders 👻

This was the problem that taught me the most, and the main reason for this article.

In the first version, I synced files only. Directories were "inferred" from file paths. It sounds clever and economical — and it works, until it doesn't.

Symptom: I'd delete an entire folder on one machine, and on the other only the files disappeared. The empty folder stayed there, stranded. Worse: empty folders never synced at all. I nicknamed the bug the ghost folder.

Why did it happen? Because a directory wasn't a real entity in the system. The server even had a Dirs field in the manifest, but it wiped it on every save (clear(s.manifest.Dirs)) and didn't even return it from the API. The client never recorded folders. There were half-finished functions, dead code from earlier attempts. Folder removal relied on a fragile heuristic of "clean up directories that became empty during this sync" — which didn't cover an intentionally empty folder, nor the case where the folder was already empty beforehand.

The turning point was treating a directory as a first-class citizen, exactly like a file:

The scan started recording every folder (including empty ones).
The server started persisting and returning Dirs, and recording ancestor folders when a file is uploaded.
Reconciliation became a pure planner, BuildPlan, that produces a complete plan: uploads, downloads, deletions, and directory creation/removal.

The same three-way merge used for files now applies to folders. And order matters: create folders shallow-to-deep (parent before child), remove them deep-to-shallow (child before parent). Here's a slice of the plan handling a folder that only exists on the remote:

case !hasL && hasR:
	if hadOld {
		// existed at the last sync and vanished locally → mirror the removal,
		// unless there's still live remote content inside it
		if hasRemoteContent {
			p.NextDirs[key] = rd
		} else {
			p.RmdirRemote = append(p.RmdirRemote, key)
		}
	} else {
		// new remote folder: if it's already "implied" by a file, the
		// download creates it; otherwise it needs an explicit mkdir (empty folder)
		if hasRemoteContent {
			p.NextDirs[key] = rd
		} else {
			p.MkdirLocal = append(p.MkdirLocal, key)
		}
	}

Note two details I'm proud of:

An empty folder is the only thing that needs an explicit mkdir. A folder with a file gets created for free when the file is written — so I don't waste calls.
The delete-vs-create conflict guard. If I delete a folder on one machine, but another adds a new file inside it at the same time, the file wins and the folder survives. Locally this falls out naturally because os.Remove only removes an empty directory: if there's new content, it refuses and I move on.

func (s *Syncer) removeLocalDir(root, dir string) {
	err := os.Remove(filepath.Join(root, filepath.FromSlash(dir)))
	if err == nil || errors.Is(err, os.ErrNotExist) || isNotEmpty(err) {
		return // success, already gone, or had new content → all good
	}
	s.log("rmdir %s: %v", dir, err)
}

Result: creating, emptying, or deleting folders — including empty ones — mirrors symmetrically across devices. Ghost exorcised. And the pure planner became trivial to test (feed it three maps, check the plan), which gave me a huge safety net.
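To make "feed it three maps, check the plan" concrete, here is a deliberately reduced sketch of a directory planner, including the ordering rule (create parents first, remove children first). It is not the repository's BuildPlan: dirPlan and planDirs are hypothetical, folder modes and the live-content guard are left out, and each map holds only true values for the directories present in that view.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// dirPlan is a hypothetical, reduced version of the directory part of a plan.
type dirPlan struct {
	MkdirLocal, MkdirRemote []string
	RmdirLocal, RmdirRemote []string
}

func depth(p string) int { return strings.Count(p, "/") }

// sortByDepth orders paths by depth (shallow first, or deep first when
// deepFirst is set), breaking ties alphabetically.
func sortByDepth(dirs []string, deepFirst bool) {
	sort.Slice(dirs, func(i, j int) bool {
		di, dj := depth(dirs[i]), depth(dirs[j])
		if di != dj {
			if deepFirst {
				return di > dj
			}
			return di < dj
		}
		return dirs[i] < dirs[j]
	})
}

// planDirs applies the three-way comparison to directories only.
func planDirs(local, remote, state map[string]bool) dirPlan {
	var p dirPlan
	union := map[string]bool{}
	for k := range local {
		union[k] = true
	}
	for k := range remote {
		union[k] = true
	}
	for k := range union {
		l, r, old := local[k], remote[k], state[k]
		switch {
		case l && !r && old: // known folder vanished remotely: mirror removal
			p.RmdirLocal = append(p.RmdirLocal, k)
		case l && !r: // new local folder
			p.MkdirRemote = append(p.MkdirRemote, k)
		case !l && r && old: // known folder vanished locally: mirror removal
			p.RmdirRemote = append(p.RmdirRemote, k)
		case !l && r: // new remote folder
			p.MkdirLocal = append(p.MkdirLocal, k)
		}
	}
	sortByDepth(p.MkdirLocal, false) // parent before child
	sortByDepth(p.MkdirRemote, false)
	sortByDepth(p.RmdirLocal, true) // child before parent
	sortByDepth(p.RmdirRemote, true)
	return p
}

func main() {
	local := map[string]bool{"docs": true, "old": true, "old/sub": true}
	remote := map[string]bool{"docs": true, "new": true, "new/deep": true}
	state := map[string]bool{"docs": true, "old": true, "old/sub": true}

	p := planDirs(local, remote, state)
	fmt.Println(p.MkdirLocal, p.RmdirLocal) // prints: [new new/deep] [old/sub old]
}
```

Because the function touches no disk and no network, each hard case becomes a three-line test: build the maps, call the planner, compare the slices.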

Other details worth the click

A few little things I learned to appreciate:

Hashing while streaming. On upload, the body flows through an io.TeeReader that feeds the SHA-256 at the same time it sends. I don't read the file twice, and I verify at the end that the local hash matches what the server computed.
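Atomic writes. A downloaded file is first written to a temporary path; only when it is complete does os.Rename move it into place. On Linux that rename is atomic as long as the temporary path and the destination are on the same filesystem, so you never end up with a half-written file at the destination.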
Detecting a file changing mid-upload. A stableFileReader checks, during the send, whether size/mtime changed — if they did, it cancels the upload instead of shipping something inconsistent.
The .downloading placeholder. While downloading, a file.downloading appears in the folder, cleaned up automatically when the download finishes, fails, or the daemon restarts. Cheap visual feedback.
Conditional manifest with ETag. Polling sends If-None-Match; if nothing changed on the server, it answers 304 Not Modified. Saves bandwidth for free.
Path traversal protection. Paths with .., absolute paths, or \ are rejected before touching the disk.
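Two of the items above, hashing while streaming and atomic writes, combine naturally in one small helper. The following is a sketch under assumptions rather than the repository's code: writeAtomic and roundTrip are hypothetical names, the temporary file is created next to the destination so the rename stays on one filesystem, and the hash is computed on the write path here purely for illustration (the article does it on upload).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// writeAtomic (hypothetical) streams r into a temp file beside dst, hashing
// the bytes as they pass, then renames the finished file into place.
func writeAtomic(dst string, r io.Reader) (sum string, err error) {
	tmp, err := os.CreateTemp(filepath.Dir(dst), ".partial-*")
	if err != nil {
		return "", err
	}
	defer func() {
		if err != nil { // on any failure, leave nothing behind
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()

	h := sha256.New()
	// TeeReader: every byte read from r is also written into the hash,
	// so the data is traversed only once.
	if _, err = io.Copy(tmp, io.TeeReader(r, h)); err != nil {
		return "", err
	}
	if err = tmp.Close(); err != nil {
		return "", err
	}
	if err = os.Rename(tmp.Name(), dst); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// roundTrip writes content into a throwaway directory and reads it back.
func roundTrip(content string) (sum, onDisk string, err error) {
	dir, err := os.MkdirTemp("", "syncdrive-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	dst := filepath.Join(dir, "note.md")
	if sum, err = writeAtomic(dst, strings.NewReader(content)); err != nil {
		return "", "", err
	}
	data, err := os.ReadFile(dst)
	if err != nil {
		return "", "", err
	}
	return sum, string(data), nil
}

func main() {
	sum, onDisk, err := roundTrip("hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(sum, onDisk)
}
```

A reader of the destination path sees either the old file or the complete new one; the returned hash can then be compared with the one recorded in the manifest.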

What I deliberately left out

So I don't oversell it:

Auth is just a token compared with subtle.ConstantTimeCompare. Fine behind HTTPS for personal use; it is not an account system. Swapping it for something robust is a separate exercise.
No resumable transfers. A large file that's interrupted restarts from scratch. It could be done in chunks, but that was off-focus.
It's not a product. It's a learning project that I actually use. Adapt it however you like.

Why it's a great project for learning Go

If you want to level up in Go beyond "CRUD with net/http", I recommend building something like this. In a single project, you brush up against:

Real concurrency and parallelism: worker pools, channel fan-out, sync.Mutex/RWMutex, WaitGroup, and the done-channel pattern for clean shutdown.
Idiomatic select: non-blocking sends, cancellation, event coalescing.
net/http on both sides: a server with ServeMux, streaming large bodies, conditional headers (ETag/304), and a client reusing connections.
Serious I/O: io.Reader/io.Writer, TeeReader, MultiWriter, atomic writes, streaming hashing.
Testable design: extracting the decision into a pure function (BuildPlan) and covering the hard cases without spinning up any server.
Hunting concurrency bugs: running go test -race to catch data races, and discovering, under load, the send on closed channel panic that never shows up on the happy path.

And, perhaps most important: a problem with real depth. "Syncing folders" sounds trivial until you face deletions, conflicts, empty folders, and races. That's where the learning lives.

Wrapping up

I started out just wanting to sync my Obsidian vault across machines, taking advantage of a free Oracle VM, and ended up with a sync tool I understand end to end — one that taught me more Go than any standalone tutorial. The code is open at github.com/LuizFernando991/go-sync-folder; clone it, break it, improve it, make it your own.

If you do dig in, my advice: start with decideFile and BuildPlan. That's where the system "thinks."