Some frequently used operations, listed here for convenience:

Adding SSH Key

generate a key:

ssh-keygen -t ed25519

add the public key to the remote machine (run this on the remote machine):

mkdir -p ~/.ssh && chmod 700 ~/.ssh && echo "your-ssh-key" >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys
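Running the append command twice leaves a duplicate entry in authorized_keys. A minimal sketch of an idempotent variant follows; the function name `add_authorized_key` and its optional second argument (a home directory, useful for testing) are hypothetical and not part of any SSH tooling.

```shell
# add_authorized_key KEY [HOME_DIR]
# Appends KEY to HOME_DIR/.ssh/authorized_keys unless that exact line
# is already present. HOME_DIR defaults to $HOME.
add_authorized_key() {
    local key="$1"
    local ssh_dir="${2:-$HOME}/.ssh"
    local auth_file="$ssh_dir/authorized_keys"

    # sshd ignores the file when permissions are too open
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    # -x: whole-line match, -F: fixed string (no regex), -q: quiet
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}
```

Usage would look like `add_authorized_key "your-ssh-key"`, with the full public key line in place of the placeholder.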

add a private key to the ssh-agent:

ssh-add your_file.pem

Disabling password login

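allow login only via authorized SSH keys:

nano /etc/ssh/sshd_config

# set the following options in the file
# (ChallengeResponseAuthentication: change it only if it already exists)
PasswordAuthentication no
ChallengeResponseAuthentication no
UsePAM no

# then restart the SSH daemon; if the first command does not work, use the second
systemctl restart sshd
service ssh restart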
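The manual edit above can also be scripted. The following is a rough sketch, assuming GNU sed (for `-i` and the case-insensitive `I` flag); `set_sshd_option` is a hypothetical helper name, not an OpenSSH tool. It rewrites an existing line for the option, commented out or not, and appends the option if it is missing.

```shell
# set_sshd_option FILE KEY VALUE
# Sets "KEY VALUE" in an sshd_config-style file.
set_sshd_option() {
    local file="$1" key="$2" value="$3"

    if grep -qiE "^[#[:space:]]*${key}[[:space:]]" "$file"; then
        # Replace every line that sets KEY, including commented-out ones
        sed -i -E "s|^[#[:space:]]*${key}[[:space:]].*|${key} ${value}|I" "$file"
    else
        # Option not present at all: append it
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}
```

For example, `set_sshd_option /etc/ssh/sshd_config PasswordAuthentication no`. Running `sshd -t` before restarting checks the file for syntax errors, and keeping an existing SSH session open while testing avoids locking yourself out.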

Sync files to s3 periodically

one method (every 2 minutes, using s5cmd):

watch -n 120 ~/s5cmd sync . "s3://dataset-ingested/datagen_workspace/02_nai_default/"

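another way (every 15 minutes, using the AWS CLI; `;` instead of `&&` so a failed sync still waits before retrying):

# periodically upload training outputs to S3
while true; do aws s3 sync ./ s3://bucket-external/model_store/fulldan_artstation_600k_test/; sleep 900; done

# periodically pull the files down on the AWS machine
while true; do aws s3 sync s3://bucket-external/model_store/fulldan_artstation_600k_test/ ./; sleep 900; done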
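The loops above can be wrapped in a small reusable function. This is only a sketch: `run_every` and its `MAX_RUNS` argument are hypothetical names introduced here, with `MAX_RUNS` set to 0 meaning "loop forever". It logs failures and always sleeps between runs, so a failing sync does not spin.

```shell
# run_every SECONDS MAX_RUNS COMMAND [ARGS...]
# Runs COMMAND every SECONDS seconds. MAX_RUNS=0 loops forever.
run_every() {
    local interval="$1" max_runs="$2"
    shift 2
    local runs=0

    while true; do
        # Keep looping even when the command fails; just report it
        if ! "$@"; then
            echo "command failed at $(date); next attempt in ${interval}s" >&2
        fi
        runs=$((runs + 1))
        if [ "$max_runs" -gt 0 ] && [ "$runs" -ge "$max_runs" ]; then
            break
        fi
        sleep "$interval"
    done
}
```

For example: `run_every 900 0 aws s3 sync ./ s3://bucket-external/model_store/fulldan_artstation_600k_test/`.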

Resilient SSH Tunnel

run it in a tmux session to keep the port forward alive across disconnects:

while true; do   echo "🌐 Attempting SSH tunnel at $(date)...";   ssh -o ServerAliveInterval=30 -o ServerAliveCountMax=3 -L 6333:localhost:6333 ubuntu@35.86.100.156 || echo "❌ SSH exited. Retrying in 5 seconds...";   sleep 5; done

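Using gallery-dl

download from a URL, keeping only files that pass the filter (at least 512x512, not mp4 or gif, dated after 2019-01-01, more than 20 favorites), with cookies taken from Edge: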

gallery-dl --mtime-from-date --write-metadata --write-info-json --filter "width >= 512 and height >= 512 and extension not in ('mp4', 'gif') and date > datetime(2019, 1, 1) and favorite_count > 20" --cookies-from-browser edge URL

Killing VSCode

After a server has been running for a long time, the VS Code server processes can become slow.

Command to kill all (the `[.]` keeps grep from matching its own process):

ps uxa | grep '[.]vscode-server' | awk '{print $2}' | xargs kill -9

Using HF CLI

download to a local dir:

huggingface-cli download animetimm/danbooru-wdtagger-v4-w640-ws-30k --repo-type dataset --local-dir danbooru-wdtagger-v4-w640-ws-30k
huggingface-cli download deepghs/wdtagger-v4-extended-seed --repo-type dataset --local-dir deepghs_wdtagger-v4-extended-seed

upload a local dir to HF:

huggingface-cli upload --repo-type dataset --private incantor/danbooru-wdtagger-v4-w640-ws-30k danbooru-wdtagger-v4-w640-ws-30k
huggingface-cli upload --repo-type dataset --private incantor/wdtagger-v4-extended-seed deepghs_wdtagger-v4-extended-seed

Remove locally installed packages

paste-able command to remove packages installed as editable under /local/ (a heredoc rather than `python -c`, because `try:`/`for:` blocks cannot be joined with `;` on one line):

python - <<'PY'
import glob, json, os, site, subprocess, sys, sysconfig
from urllib.parse import urlparse
from urllib.request import url2pathname

try:
    import importlib.metadata as im
except ImportError:
    import importlib_metadata as im

LP = '/local/'
found = set()

# Editable installs record their source directory in direct_url.json
for d in im.distributions():
    try:
        j = json.loads(d.read_text('direct_url.json') or '{}')
        u = j.get('url') or ''
        if j.get('dir_info', {}).get('editable') and u.startswith('file://'):
            if url2pathname(urlparse(u).path).startswith(LP):
                found.add(d.metadata['Name'])
    except Exception:
        pass

# Legacy "develop" installs leave *.egg-link files in site-packages
sd = set()
try:
    sd.update(site.getsitepackages())
except Exception:
    pass
for k in ('purelib', 'platlib'):
    p = sysconfig.get_paths().get(k)
    if p:
        sd.add(p)
try:
    sd.add(site.getusersitepackages())
except Exception:
    pass
for sp in sd:
    for egg in glob.glob(os.path.join(sp, '*.egg-link')):
        try:
            with open(egg, encoding='utf-8', errors='ignore') as f:
                first = f.readline().strip()
            if first.startswith(LP):
                found.add(os.path.basename(egg).split('.egg-link')[0])
        except Exception:
            pass

pkgs = sorted(found)
if not pkgs:
    print('No editable /local packages found.')
    sys.exit(0)
print('Uninstalling:', ' '.join(pkgs))
subprocess.check_call([sys.executable, '-m', 'pip', 'uninstall', '-y', *pkgs])
PY