Databases are bloat; it's better to use a plain text file and the Unix tools to search for information inside it.


Here's a practical, realistic guide to using a plain text file as a "database" for a school, using only grep, awk, sed, sort, cut, and other Unix tools. The key is a well-defined format: one record per line, with consistently separated fields.


1. Pick a simple, stable, easy-to-parse format

The best format for a "plain-text database" driven by Unix tools is:

Format: TSV (tab-separated values, TAB-delimited)

Why:

  • It doesn't collide with commas inside free text.
  • awk -F '\t' reads it perfectly.
  • grep works just the same.
  • Easy to edit in Vim.

2. A universal structure for any record type

Use a fixed "schema":

TYPE    ID      DATE        FIELD1      FIELD2     FIELD3     EXTRA

Recommended fields:

  • TYPE: the record type → DOCENTE, ALUMNO, ASIST, LICENCIA, AMON, TUTOR, etc.
  • ID: a unique identifier → DNI, file number (legajo), or a code.
  • DATE: standard format → 2025-03-04
  • FIELD1..FIELDn: depend on the type.
  • EXTRA: optional free text.

3. Real-world examples

Teachers

DOCENTE  20300300    2025-03-01  ingreso     Juan Perez     Matemática
DOCENTE  20300300    2025-03-15  licencia    enfermedad     2-dias

Students

ALUMNO   55022110    2025-03-01  ingreso     Lopez Maria    2B
TUTOR    55022110    2025-03-01  tutor       Carlos Lopez   padre

Attendance

ASIST    55022110    2025-03-04  presente    2B
ASIST    55022110    2025-03-05  ausente     2B

Reprimands

AMON     55022110    2025-03-05  leve        celular        profesorX
AMON     55022110    2025-03-10  grave       conducta       preceptorY

4. Quick queries with Unix

■ All records for one student:

grep -P "\t55022110\t" base2025.tsv

■ Attendance records only:

grep "^ASIST" base2025.tsv

■ One student's attendance:

awk -F '\t' '$1=="ASIST" && $2=="55022110"' base2025.tsv

■ Number of absences:

awk -F '\t' '$1=="ASIST" && $2=="55022110" && $4=="ausente" {c++} END {print c}' base2025.tsv

■ Serious reprimands (the sample AMON records don't carry the class, so filter on severity):

awk -F '\t' '$1=="AMON" && $4=="grave"' base2025.tsv

5. Adding a record from Bash

Make yourself a script:

add.sh

#!/bin/sh
# %b expands the \t escapes in the argument (echo -e is not portable under /bin/sh)
printf '%b\n' "$1" >> base2025.tsv

Usage:

./add.sh "ASIST\t55022110\t2025-03-06\tpresente\t2B"
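
A variant that takes the fields as separate arguments and joins them with real TABs, so you don't have to type the \t escapes (the add2.sh name is illustrative):

#!/bin/sh
# add2.sh: join all arguments with a literal TAB and append one record
IFS="$(printf '\t')"
printf '%s\n' "$*" >> base2025.tsv

./add2.sh ASIST 55022110 2025-03-06 presente 2B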

6. How to validate data

You can add checks with awk:

awk -F '\t' 'NF < 4 { print "Invalid record:", $0 }' base2025.tsv
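
A slightly stronger check, sketched against the schema above (known types, ISO date in field 3):

awk -F '\t' '
  NF < 4 { print "too few fields:", $0; next }
  $1 !~ /^(DOCENTE|ALUMNO|ASIST|LICENCIA|AMON|TUTOR)$/ { print "unknown type:", $0; next }
  $3 !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ { print "bad date:", $0 }
' base2025.tsv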

7. Organization by year

My recommendation:

/datos-colegio/
    base2025.tsv
    base2026.tsv
    base2027.tsv

One big file per year is perfect.
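
Querying across years is then just a glob away; for example, one student's attendance over every year on file:

awk -F '\t' '$1=="ASIST" && $2=="55022110"' base*.tsv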


8. Optional: split into "virtual tables"

If you want to split things up a bit:

docentes2025.tsv
alumnos2025.tsv
asistencias2025.tsv
amonestaciones2025.tsv
tutores2025.tsv

But your single-file idea is perfectly valid: all you need is the TYPE field.
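
And if you ever change your mind, one awk pass splits the single file into per-type files (a sketch; the output names are illustrative):

# writes DOCENTE rows to docente2025.tsv, ASIST rows to asist2025.tsv, etc.
awk -F '\t' '{ print > (tolower($1) "2025.tsv") }' base2025.tsv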


⚡ Want me to generate a complete schema and a list of useful commands for each data type?

I can put together a "system manual" with everything predefined.


You can absolutely use a plain text file as a database in Unix. This is classic Unix philosophy: “Text is the universal interface.” Below are practical patterns depending on how you want to store and query data — all without jq, without SQLite, without JSON — just text + grep/awk/sed.


✅ 1. Choose a Format (custom plain text)

The simplest and most robust flat-file database formats are:

A) KEY=VALUE (one record per line)

id=1 name="John Doe" age=30
id=2 name="Alice" age=25

B) CSV (comma separated)

1,John Doe,30
2,Alice,25

C) Pipe/Tab separated

1|John Doe|30
2|Alice|25

D) One file per record (directory database)

db/
  1.txt
  2.txt

Each file contains arbitrary text.


✅ 2. How to Query the “Database”

🔍 Search rows by field

KEY=VALUE format

Find all entries where name contains “Alice”:

grep 'name="Alice"' db.txt

Find id=2:

grep '^id=2 ' db.txt

CSV example

Find records whose first column is “2”:

awk -F, '$1 == 2' db.csv
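
To pull a single field out of a KEY=VALUE record instead of the whole line, sed can extract it (a sketch, assuming the age=N field is present):

# print just the age of id=1
grep '^id=1 ' db.txt | sed 's/.*age=\([0-9]*\).*/\1/'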

🔄 3. Insert new record

Append:

echo 'id=3 name="Bob" age=50' >> db.txt

✏ 4. Update a record

Replace age for id=1:

sed -i 's/^id=1 .*/id=1 name="John Doe" age=31/' db.txt

❌ 5. Delete a record

Delete id=2:

sed -i '/^id=2 /d' db.txt
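
One portability note: the bare -i used above is the GNU sed form; BSD/macOS sed needs an explicit (possibly empty) backup suffix:

sed -i '' '/^id=2 /d' db.txt     # BSD/macOS
sed -i.bak '/^id=2 /d' db.txt    # works on both, keeps a .bak backup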

🔥 6. Example Flat-File Database Implementation

db.sh

#!/bin/sh

DB="db.txt"

# add <id> <name> <age>: append a new record
add() {
    echo "id=$1 name=\"$2\" age=$3" >> "$DB"
}

# get <id>: print the matching record
get() {
    grep "^id=$1 " "$DB"
}

# delete <id>: remove the matching record
delete() {
    sed -i "/^id=$1 /d" "$DB"
}

# update_age <id> <name> <age>: rewrites the whole line, so pass the name again too
update_age() {
    sed -i "s/^id=$1 .*/id=$1 name=\"$2\" age=$3/" "$DB"
}

# dispatch: the first argument names the function to call
"$@"

Usage:

./db.sh add 1 "John Doe" 30
./db.sh add 2 "Alice" 25
./db.sh get 2
./db.sh update_age 1 "John Doe" 31
./db.sh delete 2

📂 7. Alternative: directory per record

db/
  1
  2

Create:

echo "name=Alice" > db/2
echo "age=25" >> db/2

Search:

grep -R "Alice" db/

⭐ When is this useful?

✔ shell scripts
✔ embedded devices
✔ tiny systems
✔ config management
✔ low data volume (<50k records)
✔ no dependencies
✔ grep/awk/sed are extremely fast


⚠ When it falls apart

❌ hundreds of thousands of rows
❌ concurrent writes
❌ complex queries (joins, ordering, types)


If you want, I can:

  • design a full mini-database system in shell
  • give you CRUD helpers
  • make an “SQL-like” interface in bash
  • design indexing for fast lookups (using grep caches)

Just tell me your ideal workflow.


Nice — you want a practical, file-based “database” using only Unix tools. Below is a focused, pragmatic guide + ready-to-run shell toolkit you can copy-paste and use. It covers data layout choices, common operations (CRUD + queries), performance & safety tips, and a simple script (plaindb.sh) that implements insert/find/update/delete on newline-delimited JSON (NDJSON) and on delimited CSV/TSV. I thought deeper about scale, concurrency, indexing, and schema evolution so you don’t get burned later.

Why plain files (quick pros/cons)

Pros

  • Simple, transparent, easy to version with git.
  • No DB server, low dependency surface.
  • Fast for many read-heavy workloads using grep/rg/awk/jq.

Cons
  • No transactions unless you implement locking/atomic writes.
  • Harder to scale and do complex joins/queries compared to a DB.
  • Requires careful handling for concurrency, schema evolution, and backups.

Recommended file formats

  1. NDJSON (newline-delimited JSON) — each line is a JSON object. Best for structured records, works great with jq.

    • Example: users.ndjson
    • {"id":1,"name":"A","email":"a@x.com"}
  2. CSV/TSV — great for tabular data and fast parsing with awk, cut.

    • Use \t (TSV) to avoid comma-escaping headaches.
  3. Flat key-value: one key:value entry per line, for tiny lookup tables.

  4. Inverted index files — precomputed index files to support fast full-text lookups.

Tools to use (short)

  • grep / rg (ripgrep) — text search
  • awk — field processing, aggregations, joins-ish
  • sed — in-place edits / transforms
  • jq — JSON querying and mutation
  • cut, sort, uniq, join — selection and joins on sorted files
  • flock — file locking for safe concurrent writes
  • mv/cp technique — atomic replace (mv tmp file)
  • gzip/xz — compress cold data
  • git — version your files for history & lightweight rollback
  • fzf — interactive selection

Patterns & examples

1) NDJSON: append, find, update, delete

File: users.ndjson

{"id":1,"name":"Alice","email":"alice@example.com"}
{"id":2,"name":"Bob","email":"bob@example.com"}

Insert (append safely)

# create a new record and append atomically
record='{"id":3,"name":"Carol","email":"carol@example.com"}'
printf '%s\n' "$record" >> users.ndjson
# better: use flock for multi-writer safety (see script below)

Find records

  • Full-text with grep:
grep -i 'alice' users.ndjson
  • Field-level with jq:
jq -c 'select(.email=="alice@example.com")' users.ndjson

Select columns (project)

jq -r '.id, .name' users.ndjson    # prints each field on a new line (not ideal)
jq -r '. | [.id, .name] | @tsv' users.ndjson  # id<TAB>name

Update a record (idempotent pattern)

Can’t modify in-place reliably — create a new file then move (-c keeps the output one record per line):

jq -c 'if .id==2 then .email="bob@new.com" else . end' users.ndjson > users.ndjson.tmp
mv users.ndjson.tmp users.ndjson

If multiple writers exist, use flock to protect the critical section (script later).

Delete

jq -c 'select(.id != 2)' users.ndjson > users.ndjson.tmp && mv users.ndjson.tmp users.ndjson

2) CSV/TSV with awk

File: products.tsv (TAB-separated; header: id, name, price)

id	name	price
1	Widget	12.50
2	Gizmo	9.99

Find where price > 10

awk -F'\t' 'NR==1{print; next} $3+0 > 10' products.tsv

Group by and count

Count products by name prefix:

awk -F'\t' 'NR>1 {prefix=substr($2,1,3); counts[prefix]++} END{for (k in counts) print k, counts[k]}' products.tsv | sort -k2 -n -r

Join two files (both sorted by key)

join requires files sorted on join field:

# assume customers.tsv and orders.tsv keyed by customer_id in column1
join -t$'\t' -1 1 -2 1 <(sort -k1,1 customers.tsv) <(sort -k1,1 orders.tsv)

3) Indexing for speed (simple inverted index)

If you have lots of text and grep becomes slow / you need targeted search, create an index mapping token → list of record IDs.

Example pipeline to build a basic inverted index (tokenized, lowercased words; plain sort -u dedupes repeated word/id pairs while keeping each word's ids grouped):

# input: docs.ndjson each line {"id":123, "text":"..."}
jq -r '. | [.id, .text] | @tsv' docs.ndjson \
  | tr '[:upper:]' '[:lower:]' \
  | awk -F'\t' '{id=$1; text=$2; gsub(/[^a-z0-9]+/," ",text); split(text, a, " "); for(i in a) if(length(a[i])>1) print a[i] "\t" id}' \
  | sort -u \
  | awk -F'\t' '{word=$1; id=$2; ids[word]=ids[word]?ids[word] "," id : id} END{for (w in ids) print w "\t" ids[w]}' \
  > index.tsv

Lookup:

grep -P '^keyword\t' index.tsv
# parse list of ids and then pull records from ndjson using jq or grep -Ff
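
A sketch of that second step, assuming records carry a numeric "id" field as in docs.ndjson above:

# collect the ids for one keyword, one per line
grep -P '^keyword\t' index.tsv | cut -f2 | tr ',' '\n' > ids.txt
# turn each id into a fixed-string pattern and pull the matching records
sed 's/.*/"id":&,/' ids.txt | grep -Ff - docs.ndjson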

4) Transactions & concurrency (flock + atomic move)

Example pattern for safe writes:

(
  flock -x 200 || exit 1
  # make changes in a tmp file
  jq '...update...' users.ndjson > users.ndjson.tmp
  mv users.ndjson.tmp users.ndjson
) 200>users.ndjson.lock
  • flock -x grabs exclusive lock on file descriptor 200.
  • Always write to a .tmp and mv to replace atomically.

5) Backups & snapshots

  • Periodic snapshots: cp users.ndjson users.$(date +%Y%m%d%H%M).ndjson (rotation sketch after this list)
  • Lightweight: commit changes to git (git add *.ndjson && git commit -m "snapshot").
  • For big files, use incremental rsync to remote.
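
To keep snapshots from piling up, a small rotation sketch (assumes the timestamped naming in the first bullet; xargs -r is GNU):

# delete all but the 7 newest snapshots
ls -1t users.*.ndjson | tail -n +8 | xargs -r rm --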

6) Validation & schema evolution

  • Keep a small schema file schema.json that documents required fields and types.
  • Validate new records with jq filter before appending:
jq -e '(.id|type=="number") and (.email|test("@"))' <<<"$candidate"
  • For migration: write a one-shot jq or awk migration script that produces a new file, test it, then replace.
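
For example, a one-shot migration that backfills a missing field (the "status" field here is hypothetical):

# give every record a status, defaulting to "active" where absent
jq -c '.status //= "active"' users.ndjson > users.ndjson.tmp
mv users.ndjson.tmp users.ndjson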

Ready-to-use toolkit: plaindb.sh

Copy this script and put it in your $PATH (chmod +x plaindb.sh). It implements simple CRUD on an NDJSON file with locking.

#!/usr/bin/env bash
# plaindb.sh - minimal NDJSON "db" operations: insert, find, update, delete, list
# Usage: plaindb.sh <dbfile> <cmd> [args...]
# Requires: jq, flock, mktemp

DB="$1"; shift
CMD="$1"; shift

LOCK="${DB}.lock"

usage(){ cat <<EOF
Usage: $0 <dbfile> <cmd> [args...]
Commands:
  insert  <json>                 Append a JSON record (string)
  find    <jq-filter>            Print matching records (jq filter)
  list                           Print all records
  update  '<jq-update>'          Apply jq program to all records (e.g. 'if .id==2 then .name="X" else . end')
  delete  '<jq-condition>'       Delete records matching condition (jq select; e.g. '.id==2')
EOF
}

if [ -z "$DB" ] || [ -z "$CMD" ]; then usage; exit 1; fi
mkdir -p "$(dirname "$DB")" 2>/dev/null || true
: "${DB:=db.ndjson}"

with_lock() {
  # usage: with_lock <command...>
  exec 200>"$LOCK"
  flock -x 200
  "$@"
  flock -u 200
}

case "$CMD" in
  insert)
    json="$1"
    if ! echo "$json" | jq -e . >/dev/null 2>&1; then
      echo "invalid json" >&2; exit 2
    fi
    # append under lock; pass the data as arguments so quotes in the JSON can't break the shell
    with_lock sh -c 'printf "%s\n" "$1" >> "$2"' _ "$json" "$DB"
    ;;
  find)
    filter="$1"
    if [ -z "$filter" ]; then filter='.'; fi
    # safe read, no write locking necessary
    jq -c "$filter" "$DB"
    ;;
  list)
    jq -c '.' "$DB"
    ;;
  update)
    prog="$1"
    tmp="$(mktemp "${DB}.tmp.XXXX")"
    with_lock bash -c "jq -c '$prog' '$DB' > '$tmp' && mv '$tmp' '$DB'"
    ;;
  delete)
    cond="$1"
    tmp="$(mktemp "${DB}.tmp.XXXX")"
    with_lock bash -c "jq -c \"select( ( $cond ) | not )\" '$DB' > '$tmp' && mv '$tmp' '$DB'"
    ;;
  *)
    usage; exit 1
esac

Examples:

# insert
./plaindb.sh users.ndjson insert '{"id":1,"name":"Alice"}'
# find by id
./plaindb.sh users.ndjson find 'select(.id==1)'
# update
./plaindb.sh users.ndjson update 'if .id==1 then .email="a@x.com" else . end'
# delete
./plaindb.sh users.ndjson delete '.id==1'

Advanced tips (thinking deeper)

  • Search performance: rg (ripgrep) is much faster than grep. For enormous files, split data by sharding keys (year, prefix), or maintain indexes.
  • Atomic counters: maintain a small file next_id and flock it to safely increment IDs (see the sketch after this list).
  • Multi-field joins: sort files on join keys and use join. For complex joins, it’s often simpler to load small sets into memory with awk associative arrays.
  • Partial reads: store large blobs separately (e.g., blobs/<id>.dat) and keep metadata in the main NDJSON.
  • Full-text search: if you need advanced text queries (stemming, ranking), either maintain an index or use sqlite FTS (that’s a DB—but it’s a single file).
  • Monitoring & metrics: keep a metrics.log file with line-oriented events (timestamped) and roll it daily.
  • Immutable append-only log: prefer append-only files and periodic compaction—this makes crash recovery trivial. Compaction script reads file, removes tombstones, writes compacted file, swaps atomically.
  • Encryption: use gpg --encrypt for at-rest, gpg --decrypt when reading; or use age.
  • Testing & CI: add schema tests that jq-validate every new commit to the repo.

Quick cheatsheet

  • grep -i foo file — fast text search
  • jq -c 'select(.field=="x")' file — filter NDJSON
  • awk -F'\t' '$3>10' file.tsv — numeric filter on column 3
  • sort -k1,1 file | join -t$'\t' -1 1 -2 1 file2 — join on column1
  • flock -x 200; mv tmp file; — atomic replacement with lock
  • git add *.ndjson && git commit -m "snapshot" — snapshot your DB state

If you want, I can:

  • produce a slightly fancier CLI (with getopts, --where, --select) to make ad-hoc queries easier,
  • show how to build a tiny inverted index for full-text search and a lookup tool that returns top-N results,
  • or tailor the scripts for CSV/TSV only (no jq).

Which of those should I give you next?


If you want to use jq as your “database engine,” the best possible format is:

✅ NDJSON (Newline-Delimited JSON)

One JSON object per line:

users.ndjson:

{"id":1, "name":"Alice", "email":"alice@example.com"}
{"id":2, "name":"Bob", "email":"bob@example.com"}
{"id":3, "name":"Carol", "email":"carol@example.com"}

No commas, no arrays, no brackets — just one JSON object per line. This is the format used by big-data tools (Elasticsearch, Firehose, Hadoop, etc.).


✅ 1. FIND (query with jq)

List all records:

jq -c . users.ndjson

Find by name:

jq -c 'select(.name == "Alice")' users.ndjson

Case-insensitive:

jq -c 'select(.name | ascii_downcase == "alice")' users.ndjson

Find by ID:

jq -c 'select(.id == 2)' users.ndjson

Search email ending in .com:

jq -c 'select(.email | endswith(".com"))' users.ndjson

✅ 2. INSERT (append)

Append a new object:

echo '{"id":4, "name":"David", "email":"david@example.com"}' >> users.ndjson

✅ 3. DELETE a record

Delete ID=3:

jq -c 'select(.id != 3)' users.ndjson \
  > tmp && mv tmp users.ndjson

Delete every Bob:

jq -c 'select(.name != "Bob")' users.ndjson \
  > tmp && mv tmp users.ndjson

✅ 4. UPDATE a record

Example: change Bob’s email:

jq -c '
  if .id == 2
  then .email = "bob@newmail.com"
  else .
  end
' users.ndjson > tmp && mv tmp users.ndjson

Update multiple fields:

jq -c '
  if .id == 3
  then .name = "Carolina" | .email = "caro@example.com"
  else .
  end
' users.ndjson > tmp && mv tmp users.ndjson

✅ 5. SORT the “database”

Sort by ID (a quick hack; fragile, since it assumes "id" is the first key on every line):

jq -c . users.ndjson | sort -n -t: -k2

Better way (jq loads fully, sorts, outputs NDJSON):

jq -c -s 'sort_by(.id)[]' users.ndjson > tmp && mv tmp users.ndjson

(-s = “slurp” = load entire file)

Sort by name:

jq -c -s 'sort_by(.name)[]' users.ndjson > tmp && mv tmp users.ndjson

✅ 6. CREATE a reusable CLI tool

jsondb.sh:

#!/bin/sh
DB="$1"
CMD="$2"
if [ "$#" -lt 2 ]; then
  echo "Usage: $0 <dbfile> <command> [args...]" >&2
  exit 1
fi
shift 2

case "$CMD" in
  find)
    jq -c "select($1)" "$DB"
    ;;

  insert)
    echo "$1" >> "$DB"
    ;;

  delete)
    jq -c "select(($1) | not)" "$DB" > tmp && mv tmp "$DB"
    ;;

  update)
    filter="$1"   # e.g. '.id == 2'
    update="$2"   # e.g. '.email = "x@y.com"'
    jq -c "
      if ($filter) then $update else . end
    " "$DB" > tmp && mv tmp "$DB"
    ;;

  *)
    echo "Commands: find / insert / delete / update"
    ;;
esac

Example:

./jsondb.sh users.ndjson find '.id == 1'
./jsondb.sh users.ndjson insert '{"id":4,"name":"D","email":"d@mail"}'
./jsondb.sh users.ndjson delete '.id == 2'
./jsondb.sh users.ndjson update '.id == 3' '.name = "Carol X"'

Want more?

I can generate:

  • a full query language (where, select, limit, order by)
  • automatic primary key indexing
  • multi-table JSON structure
  • a “transaction log” version
  • CLI with subcommands and help text

Just tell me what you’d like to build next.