Databases are bloat; it's better to use a plain text file and the Unix toolchain to search for information inside it.
Here is a practical, realistic guide to using a plain text file as a "database" for a school, using only grep, awk, sed, sort, cut, and other Unix tools. The key is a well-defined format: one record per line, with fields separated consistently.
The best format for a plain-text "database" under Unix is TSV (tab-separated values).
Reasons:
awk -F '\t' reads it perfectly, and grep works just as well. Use a fixed schema:
TYPE ID DATE FIELD1 FIELD2 FIELD3 EXTRA
TYPE: the kind of record → DOCENTE, ALUMNO, ASIST, LICENCIA, AMON, TUTOR, etc.
ID: a unique identifier → DNI, file number, or code.
DATE: standard format → 2025-03-04
FIELD1..FIELDn: depend on the type.
EXTRA: optional free text.
Example records:
DOCENTE 20300300 2025-03-01 ingreso Juan Perez Matemática
DOCENTE 20300300 2025-03-15 licencia enfermedad 2-dias
ALUMNO 55022110 2025-03-01 ingreso Lopez Maria 2B
TUTOR 55022110 2025-03-01 tutor Carlos Lopez padre
ASIST 55022110 2025-03-04 presente 2B
ASIST 55022110 2025-03-05 ausente 2B
AMON 55022110 2025-03-05 leve celular profesorX
AMON 55022110 2025-03-10 grave conducta preceptorY
Everything about ID 55022110:
grep -P "\t55022110\t" base2025.tsv
All attendance records:
grep "^ASIST" base2025.tsv
Attendance for one student:
awk -F '\t' '$1=="ASIST" && $2=="55022110"' base2025.tsv
Count that student's absences:
awk -F '\t' '$1=="ASIST" && $2=="55022110" && $4=="ausente" {c++} END {print c}' base2025.tsv
All serious (grave) warnings:
awk -F '\t' '$1=="AMON" && $4=="grave"' base2025.tsv
Make yourself a script:
add.sh:
#!/bin/sh
# append one record; %b expands the \t escapes passed by the caller
printf '%b\n' "$1" >> base2025.tsv
Usage:
./add.sh "ASIST\t55022110\t2025-03-06\tpresente\t2B"
You can add checks with awk:
awk -F '\t' 'NF < 4 { print "invalid record:", $0 }' base2025.tsv
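A slightly fuller check, sketched under the schema above (the accepted TYPE list and the date pattern are assumptions; adjust them to your real types):

awk -F '\t' '
NF < 4 { print "short record at line " NR; next }
$1 !~ /^(DOCENTE|ALUMNO|ASIST|LICENCIA|AMON|TUTOR)$/ { print "unknown TYPE at line " NR; next }
$3 !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ { print "bad date at line " NR }
' base2025.tsv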
I recommend:
/datos-colegio/
base2025.tsv
base2026.tsv
base2027.tsv
One big file per year works perfectly well.
If you want to split things up a bit:
docentes2025.tsv
alumnos2025.tsv
asistencias2025.tsv
amonestaciones2025.tsv
tutores2025.tsv
But your idea of a single file is entirely valid: you just need the TYPE field. (If you ever want the split files, you can derive them, as sketched below.)
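A minimal sketch for deriving the per-topic files from the master file instead of maintaining both by hand (output names come out as docente2025.tsv, asist2025.tsv, etc.; rename to taste):

# split the master file into one file per TYPE
awk -F '\t' '{ print > (tolower($1) "2025.tsv") }' base2025.tsv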
I can put together a "system manual" for you with everything predefined.
You can absolutely use a plain text file as a database in Unix. This is classic Unix philosophy: “Text is the universal interface.” Below are practical patterns depending on how you want to store and query data — all without jq, without SQLite, without JSON — just text + grep/awk/sed.
The simplest and most robust flat-file database formats are:
Key-value pairs:
id=1 name="John Doe" age=30
id=2 name="Alice" age=25
CSV:
1,John Doe,30
2,Alice,25
Pipe-delimited:
1|John Doe|30
2|Alice|25
One file per record:
db/
1.txt
2.txt
Each file contains arbitrary text.
Find all entries whose name is "Alice":
grep 'name="Alice"' db.txt
Find id=2:
grep '^id=2 ' db.txt
Find records whose first column is “2”:
awk -F, '$1 == 2' db.csv
Append:
echo 'id=3 name="Bob" age=50' >> db.txt
Replace age for id=1:
sed -i 's/^id=1 .*/id=1 name="John Doe" age=31/' db.txt
Delete id=2:
sed -i '/^id=2 /d' db.txt
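Note that sed -i is a GNU extension (BSD sed wants -i ''). A portable alternative, which is also atomic, is the tmp-and-mv pattern used later in this guide; a sketch:

# portable "in-place" edit: write a temp file, then swap it in
sed 's/^id=1 .*/id=1 name="John Doe" age=31/' db.txt > db.txt.tmp && mv db.txt.tmp db.txt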
#!/bin/sh
# db.sh - tiny flat-file record store in db.txt
DB="db.txt"
add() {
echo "id=$1 name=\"$2\" age=$3" >> "$DB"
}
get() {
grep "^id=$1 " "$DB"
}
delete() {
sed -i "/^id=$1 /d" "$DB"
}
update_age() {
sed -i "s/^id=$1 .*/id=$1 name=\"$2\" age=$3/" "$DB"
}
# dispatch: first argument picks the operation
case "$1" in
add|get|delete|update_age) cmd="$1"; shift; "$cmd" "$@" ;;
*) echo "usage: $0 {add|get|delete|update_age} args..." >&2; exit 1 ;;
esac
Usage:
./db.sh add 1 "John Doe" 30
./db.sh add 2 "Alice" 25
./db.sh get 2
./db.sh update_age 1 "John Doe" 31
./db.sh delete 2
One file per record:
db/
1
2
Create:
echo "name=Alice" > db/2
echo "age=25" >> db/2
Search:
grep -R "Alice" db/
Good fit:
✔ shell scripts
✔ embedded devices
✔ tiny systems
✔ config management
✔ low data volume (<50k records)
✔ no dependencies
✔ grep/awk/sed are extremely fast
Not a good fit:
❌ hundreds of thousands of rows
❌ concurrent writes
❌ complex queries (joins, ordering, types)
Just tell me your ideal workflow.
Nice: you want a practical, file-based "database" using only Unix tools. Below is a focused, pragmatic guide plus a ready-to-run shell toolkit you can copy-paste: data layout choices, common operations (CRUD + queries), performance and safety tips, and a simple script (plaindb.sh) that implements insert/find/update/delete on newline-delimited JSON (NDJSON), plus query patterns for delimited CSV/TSV. Scale, concurrency, indexing, and schema evolution are also covered so you don't get burned later.
Format choices:
NDJSON (newline-delimited JSON): each line is a JSON object. Best for structured records; works great with jq. Example users.ndjson line:
{"id":1,"name":"A","email":"a@x.com"}
CSV/TSV: great for tabular data and fast parsing with awk and cut. Prefer \t (TSV) to avoid comma-escaping headaches.
Flat key-value: key:value per line for tiny lookup tables (a lookup one-liner follows this list).
Inverted index files: precomputed index files to support fast full-text lookups.
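For the flat key-value format, lookups stay one-liners. A sketch against a hypothetical config.kv holding key:value lines:

# fetch the value for the "timeout" key
grep '^timeout:' config.kv | cut -d: -f2-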
Tools:
grep / rg (ripgrep): text search
awk: field processing, aggregations, join-ish operations
sed: in-place edits / transforms
jq: JSON querying and mutation
cut, sort, uniq, join: selection and joins on sorted files
flock: file locking for safe concurrent writes
mv/cp technique: atomic replace (write a tmp file, mv over the original)
gzip/xz: compress cold data
git: version your files for history and lightweight rollback
fzf: interactive selection
File: users.ndjson
{"id":1,"name":"Alice","email":"alice@example.com"}
{"id":2,"name":"Bob","email":"bob@example.com"}
# create a new record and append atomically
record='{"id":3,"name":"Carol","email":"carol@example.com"}'
printf '%s\n' "$record" >> users.ndjson
# better: use flock for multi-writer safety (see script below)
grep -i 'alice' users.ndjson
jq -c 'select(.email=="alice@example.com")' users.ndjson
jq -r '.id, .name' users.ndjson # prints each field on a new line (not ideal)
jq -r '. | [.id, .name] | @tsv' users.ndjson # id<TAB>name
Can’t modify in-place reliably — create a new file then move:
jq -c 'if .id==2 then .email="bob@new.com" else . end' users.ndjson > users.ndjson.tmp
mv users.ndjson.tmp users.ndjson
If multiple writers exist, use flock to protect the critical section (script later).
jq -c 'select(.id != 2)' users.ndjson > users.ndjson.tmp && mv users.ndjson.tmp users.ndjson
File: products.tsv (tab-separated; header: id, name, price)
id name price
1 Widget 12.50
2 Gizmo 9.99
awk -F'\t' 'NR==1{print; next} $3+0 > 10' products.tsv
Count products by name prefix:
awk -F'\t' 'NR>1 {prefix=substr($2,1,3); counts[prefix]++} END{for (k in counts) print k, counts[k]}' products.tsv | sort -k2 -n -r
join requires files sorted on join field:
# assume customers.tsv and orders.tsv keyed by customer_id in column1
join -t$'\t' -1 1 -2 1 <(sort -k1,1 customers.tsv) <(sort -k1,1 orders.tsv)
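When one side is small, an awk hash join avoids the sort requirement entirely. A minimal sketch, assuming customer_id is column 1 of both files and column 2 of customers.tsv is the name:

# load customers into memory, then stream orders and attach the name
awk -F '\t' 'NR==FNR { name[$1]=$2; next } $1 in name { print $0 "\t" name[$1] }' customers.tsv orders.tsv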
If you have lots of text and grep becomes slow / you need targeted search, create an index mapping token → list of record IDs.
Example pipeline to build a basic inverted index (tokenized lowercased words):
# input: docs.ndjson each line {"id":123, "text":"..."}
jq -r '. | [.id, .text] | @tsv' docs.ndjson \
| tr '[:upper:]' '[:lower:]' \
| awk -F'\t' '{id=$1; text=$2; gsub(/[^a-z0-9]+/," ",text); split(text, a, " "); for(i in a) if(length(a[i])>1) print a[i] "\t" id}' \
| sort -u \
| awk -F'\t' '{word=$1; id=$2; ids[word]=ids[word]?ids[word] "," id : id} END{for (w in ids) print w "\t" ids[w]}' \
> index.tsv
Lookup:
grep -P '^keyword\t' index.tsv
# parse list of ids and then pull records from ndjson using jq or grep -Ff
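Putting both steps together; a sketch (keyword is a placeholder, and the pattern assumes "id" is immediately followed by a comma in each record, as in the docs.ndjson shape above):

# turn "id1,id2,..." into one fixed-string pattern per id, then fetch records
grep -P '^keyword\t' index.tsv | cut -f2 | tr ',' '\n' \
| sed 's/^/"id":/; s/$/,/' \
| grep -Ff - docs.ndjson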
Example pattern for safe writes:
(
flock -x 200 || exit 1
# make changes in a tmp file
jq '...update...' users.ndjson > users.ndjson.tmp
mv users.ndjson.tmp users.ndjson
) 200>users.ndjson.lock
Notes:
- flock -x grabs an exclusive lock on file descriptor 200.
- Write changes to a .tmp file and mv it over the original to replace atomically.
- Back up with cp users.ndjson users.$(date +%Y%m%d%H%M).ndjson, or snapshot with git add *.ndjson && git commit -m "snapshot".
- Keep a schema.json that documents required fields and types.
- Validate a candidate record with a jq filter before appending:
jq -e '(.id|type=="number") and (.email|test("@"))' <<<"$candidate"
- For schema changes, write a jq or awk migration script that produces a new file, test it, then replace the original.
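Combining the validation filter with the locking pattern gives a safe append; a minimal sketch (safe_insert is a hypothetical helper, not part of any tool):

safe_insert() {
  # reject malformed or incomplete records, then append under the lock
  printf '%s' "$1" | jq -e '(.id|type=="number") and (.email|test("@"))' >/dev/null || return 1
  (
    flock -x 200 || exit 1
    printf '%s\n' "$1" >> users.ndjson
  ) 200>users.ndjson.lock
}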
plaindb.sh
Copy this script and put it in your $PATH (chmod +x plaindb.sh). It implements simple CRUD on an NDJSON file with locking.
#!/usr/bin/env bash
# plaindb.sh - minimal NDJSON "db" operations: insert, find, update, delete, list
# Usage: plaindb.sh <dbfile> <cmd> [args...]
# Requires: jq, flock, mktemp
DB="$1"; shift
CMD="$1"; shift
LOCK="${DB}.lock"
usage(){ cat <<EOF
Usage: $0 <dbfile> <cmd> [args...]
Commands:
insert <json> Append a JSON record (string)
find <jq-filter> Print matching records (jq filter)
list Print all records
update '<jq-update>' Apply jq program to all records (e.g. 'if .id==2 then .name="X" else . end')
delete '<jq-condition>' Delete records matching condition (jq select; e.g. '.id==2')
EOF
}
if [ -z "$DB" ] || [ -z "$CMD" ]; then usage; exit 1; fi
mkdir -p "$(dirname "$DB")" 2>/dev/null || true
: "${DB:=db.ndjson}"
with_lock() {
# usage: with_lock <command...>
exec 200>"$LOCK"
flock -x 200
"$@"
flock -u 200
}
case "$CMD" in
insert)
json="$1"
if ! printf '%s' "$json" | jq -e . >/dev/null 2>&1; then
echo "invalid json" >&2; exit 2
fi
# append under lock; pass the record as an argument so its quotes survive intact
with_lock sh -c 'printf "%s\n" "$1" >> "$2"' _ "$json" "$DB"
;;
find)
filter="$1"
if [ -z "$filter" ]; then filter='.'; fi
# safe read, no write locking necessary
jq -c "$filter" "$DB"
;;
list)
jq -c '.' "$DB"
;;
update)
prog="$1"
tmp="$(mktemp "${DB}.tmp.XXXX")"
# build the output in a temp file under lock, then swap it in
with_lock sh -c 'jq -c "$1" "$2" > "$3" && mv "$3" "$2"' _ "$prog" "$DB" "$tmp"
;;
delete)
cond="$1"
tmp="$(mktemp "${DB}.tmp.XXXX")"
with_lock sh -c 'jq -c "select( ($1) | not )" "$2" > "$3" && mv "$3" "$2"' _ "$cond" "$DB" "$tmp"
;;
*)
usage; exit 1
esac
Examples:
# insert
./plaindb.sh users.ndjson insert '{"id":1,"name":"Alice"}'
# find by id
./plaindb.sh users.ndjson find 'select(.id==1)'
# update
./plaindb.sh users.ndjson update 'if .id==1 then .email="a@x.com" else . end'
# delete
./plaindb.sh users.ndjson delete '.id==1'
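One more habit worth adopting: hand out IDs through a counter file instead of guessing. A minimal sketch (the next_id file and helper are assumptions, not part of plaindb.sh):

# atomically hand out the next integer ID
next_id() {
  (
    flock -x 200 || exit 1
    n=$(($(cat next_id 2>/dev/null || echo 0) + 1))
    printf '%s\n' "$n" > next_id
    printf '%s\n' "$n"
  ) 200>next_id.lock
}
# usage: ./plaindb.sh users.ndjson insert "{\"id\":$(next_id),\"name\":\"Dan\"}"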
Tips:
- rg (ripgrep) is much faster than grep. For enormous files, split data by sharding keys (year, prefix), or maintain indexes.
- Keep a next_id counter file and flock it to safely increment IDs (see the sketch above).
- For relational lookups, use join. For complex joins it's often simpler to load small sets into memory with awk associative arrays.
- Keep large blobs out of the data file (e.g. blobs/<id>.dat) and store only metadata in the main NDJSON.
- For heavy full-text search, consider sqlite FTS (that's a DB, but it's a single file).
- For metrics, append to a metrics.log file with line-oriented events (timestamped) and roll it daily.
- For data at rest, gpg --encrypt, and gpg --decrypt when reading; or use age.
- jq-validate every new commit to the repo.
Quick reference:
- grep -i foo file: fast text search
- jq -c 'select(.field=="x")' file: filter NDJSON
- awk -F'\t' '$3>10' file.tsv: numeric filter on column 3
- sort -k1,1 file | join -t$'\t' -1 1 -2 1 file2: join on column 1
- flock -x 200; mv tmp file: atomic replacement with lock
- git add *.ndjson && git commit -m "snapshot": snapshot your DB state
If you want, I can:
- add a CLI wrapper (getopts, --where, --select) to make ad-hoc queries easier.
Which of those should I give you next?
If you want to use jq as your “database engine,” the best possible format is:
One JSON object per line:
users.ndjson:
{"id":1, "name":"Alice", "email":"alice@example.com"}
{"id":2, "name":"Bob", "email":"bob@example.com"}
{"id":3, "name":"Carol", "email":"carol@example.com"}
No commas, no arrays, no brackets: just one JSON object per line. This is the format used by big-data tools (Elasticsearch, Firehose, Hadoop, etc.).
List all records:
jq -c . users.ndjson
Find by name:
jq -c 'select(.name == "Alice")' users.ndjson
Case-insensitive:
jq -c 'select(.name | ascii_downcase == "alice")' users.ndjson
Find by ID:
jq -c 'select(.id == 2)' users.ndjson
Search email ending in .com:
jq -c 'select(.email | endswith(".com"))' users.ndjson
Append a new object:
echo '{"id":4, "name":"David", "email":"david@example.com"}' >> users.ndjson
Delete ID=3:
jq -c 'select(.id != 3)' users.ndjson \
> tmp && mv tmp users.ndjson
Delete every Bob:
jq -c 'select(.name != "Bob")' users.ndjson \
> tmp && mv tmp users.ndjson
Example: change Bob’s email:
jq -c '
if .id == 2
then .email = "bob@newmail.com"
else .
end
' users.ndjson > tmp && mv tmp users.ndjson
Update multiple fields:
jq -c '
if .id == 3
then .name = "Carolina" | .email = "caro@example.com"
else .
end
' users.ndjson > tmp && mv tmp users.ndjson
Sort by ID (a crude text sort; it relies on id being the first key):
jq -c . users.ndjson | sort -n -t: -k2
Better way (jq loads fully, sorts, outputs NDJSON):
jq -c -s 'sort_by(.id)[]' users.ndjson > tmp && mv tmp users.ndjson
(-s = “slurp” = load entire file)
Sort by name:
jq -c -s 'sort_by(.name)[]' users.ndjson > tmp && mv tmp users.ndjson
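Slurping also enables aggregation, not just sorting; a sketch counting records per name:

# emit one {name, n} object per distinct name
jq -cs 'group_by(.name) | map({name: .[0].name, n: length})[]' users.ndjson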
jsondb.sh:
#!/bin/sh
DB="$1"
CMD="$2"
shift 2
case "$CMD" in
find)
jq -c "select($1)" "$DB"
;;
insert)
echo "$1" >> "$DB"
;;
delete)
jq -c "select(($1) | not)" "$DB" > tmp && mv tmp "$DB"
;;
update)
filter="$1" # e.g. '.id == 2'
update="$2" # e.g. '.email = "x@y.com"'
jq -c "
if ($filter) then $update else . end
" "$DB" > tmp && mv tmp "$DB"
;;
*)
echo "Commands: find / insert / delete / update"
;;
esac
Example:
./jsondb.sh users.ndjson find '.id == 1'
./jsondb.sh users.ndjson insert '{"id":4,"name":"D","email":"d@mail"}'
./jsondb.sh users.ndjson delete '.id == 2'
./jsondb.sh users.ndjson update '.id == 3' '.name = "Carol X"'
I can generate more helpers along these lines; just tell me what you'd like to build next.