{"success": true, "mode": "discover", "query": "\"6597630553\" message incoming", "results": [{"session_id": "20260607_073841_ac3216ce", "when": "June 07, 2026 at 07:38 AM", "source": "webhook", "model": "glm-5.1", "title": null, "matched_role": "user", "match_message_id": 104155, "snippet": "...true`, they are replying to a >>>message<<< Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an >>>incoming<<< WhatsApp >>>message<<< is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification...", "bookend_start": [], "messages": [{"id": 104155, "role": "user", "content": "[IMPORTANT: The user has invoked the \"whatsapp-evolution\" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]\n\n---\nname: whatsapp-evolution\ndescription: WhatsApp messaging via Evolution API — send, receive, analyze, and auto-reply. Use when processing an incoming WhatsApp message delivered via the webhook, replying on Vincent's behalf, liaising with contacts, sending messages, or handling WhatsApp-related tasks. Covers intent analysis, auto-reply rules, escalation (VIP → phone call), conversation context fetching, and draft approvals via the jimmy-wa-drafts sidecar.\n---\n\n# WhatsApp via Evolution API\n\n## Credentials\nStored in `/opt/data/secrets/.whatsapp-evolution-credentials`:\n- `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE`\n\nLoad with: `export $(cat .whatsapp-evolution-credentials | xargs)`\n\n## Sending Messages (direct, no approval)\n\n```bash\nexport $(cat .whatsapp-evolution-credentials | xargs)\ncurl -s -X POST \"${EVOLUTION_API_URL}/message/sendText/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"number\": \"6591234567\", \"text\": \"Hello!\"}'\n```\n\n- Phone format: country code, no `+` (e.g. `6590054438`)\n- Groups: use JID `{groupid}@g.us`\n- For media/location/contacts — see [references/api.md](references/api.md)\n\nUse direct send only when Vincent has explicitly commanded/approved the exact send, or for pre-approved auto-replies/follow-ups. For any reply that needs Vincent's review, use the sidecar (next section).\n\n### Direct replies when Vincent names an existing WhatsApp thread\n\nIf Vincent says to “reply to X” or “confirm with X” and has clearly approved the action, but the contact JID/number is not in the current prompt:\n\n1. Search recent session history first (`session_search`) for the contact name, clinic/business name, or appointment details. Prior webhook summaries often contain the exact `remoteJid`, alternate phone JID, sender name, and context.\n2. If session history identifies the WhatsApp phone JID, send directly only if Vincent’s instruction is explicit enough to act. Keep the text concise and natural.\n3. If only a LID JID is found, look for `remoteJidAlt`/alternate phone JID in the session summary or fetch contacts/chats before sending.\n4. Do not invent phone numbers from web search when replying to an existing WhatsApp conversation. Use web search only as supporting context, not as the destination, unless Vincent explicitly asks to contact a public number.\n\n\n### Sending trip/camp details from retrieved emails\n\nWhen Vincent explicitly asks to send extracted school trip/camp details to a child or family member, send directly. Keep it practical and grounded:\n- Include dates, reporting/pick-up times, flights, venue/itinerary highlights, deadlines, forms, and safety/medication notes when found.\n- If an official packing list is missing, say that clearly. Do not invent one. You may add a short “based on itinerary/weather, prepare…” list, clearly labeled as practical guidance rather than official school instructions.\n- Avoid long explanations about the search process in WhatsApp. Send the useful details.\n\n### Sending a local image/file as media\n\nFor user-approved media sends, base64-encode the local file and call `sendMedia`:\n\n```bash\nexport $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\npython3 - <<'PY'\nimport base64, json\npath = '/tmp/file.png'\npayload = {\n 'number': '6591234567',\n 'mediatype': 'image',\n 'mimetype': 'image/png',\n 'caption': 'Caption here',\n 'media': base64.b64encode(open(path, 'rb').read()).decode(),\n 'fileName': 'file.png',\n}\nopen('/tmp/wa_media_payload.json', 'w').write(json.dumps(payload))\nPY\ncurl -s --max-time 60 -X POST \"${EVOLUTION_API_URL}/message/sendMedia/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_media_payload.json\n```\n\nWhen sending QR tickets or passes, verify the crop/image contains the full QR code and the intended attendee name before sending; the QR screenshot is effectively the redeemable ticket.\n\n## Fetching Conversation Context\n\nBefore drafting or proposing any reply, always fetch recent chat history for context:\n\n```bash\npython3 /opt/data/scripts/wa_recent_messages.py \"6591234567@s.whatsapp.net\" 30\n```\n\nUse `limit=30` by default. For active group chats or ambiguous situations, fetch up to 50 messages. Read enough context to understand who is being discussed, current tone, previous commitments, and whether Vincent has already replied. The script returns newest messages first (descending order).\n\n## Draft Approvals — `jimmy-wa-drafts` sidecar\n\nWhen you need Vincent to approve a WhatsApp reply, POST the draft to the sidecar. It owns the entire approval flow: Telegram message with 4 inline buttons, Telegram WebApp for Edit/Context, sending to WhatsApp via Evolution API on Approve.\n\n### Posting a draft\n\nAlways include an `extras` object with the original WA context. The sidecar passes `extras` through verbatim if Vincent later clicks \"Context\" — without it, your redraft will be flying blind.\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\n \"contact\": \"120363351787895353@g.us\",\n \"contact_name\": \"Cycling group\",\n \"summary\": \"Karen sent a blessing over the bike ride; she's not joining\",\n \"draft_text\": \"Thanks Karen, will miss you both — praying for a good time too 🙏\",\n \"extras\": {\n \"group_jid\": \"120363351787895353@g.us\",\n \"sender_name\": \"Karen Tan\",\n \"sender_jid\": \"6597630553@s.whatsapp.net\",\n \"category\": \"prayer/blessing\",\n \"original_message\": \"Happy cycling, to those who can make it… Dn Jeff and I had made plans for this PM so we won't be able to join y'all. Praying the Lord's favor, protection and good-weather blessings be upon y'all and your route!\",\n \"extracted\": {\n \"what\": \"Cycling from East Coast Park to Changi Village, then dinner\",\n \"when\": \"Today, 4:00 PM\",\n \"where\": \"East Coast Park to Changi Village\",\n \"who\": \"Group members who voted yes for cycling/dinner or dinner\"\n },\n \"confidence\": \"high\",\n \"why_appropriate\": \"Multiple group messages are blessings/wishes; Vincent has not replied; tone matches grace-based norm.\",\n \"wa_message_id\": \"ABC123XYZ\"\n }\n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\nResponse: `{\"draft_id\":\"abc123def\",\"telegram_message_id\":42}`\n\nWhat to put in `extras` (use whatever makes sense for the message type):\n\n- **Always**: `original_message` (verbatim WA text), `sender_name`, `category` (e.g. `prayer`, `birthday`, `event`, `dm-reply`, `enquiry`)\n- **Group messages**: `group_jid`, `sender_jid`\n- **DMs**: `sender_jid`\n- **Strongly recommended (drives the rich Telegram render)**: `extracted` (dict of who/what/when/where), `confidence` (`high`/`medium`/`low`), `why_appropriate` (one-line reasoning shown to Vincent so he can decide)\n- **Optional**: `wa_message_id`, `tone_hint` (e.g. `warm-grace-based`), `sender_phone` (auto-derived from `sender_jid` if omitted)\n\nVincent receives a Telegram message in the `@jimmy_WA_bot` chat with 4 inline buttons. The sidecar handles every interaction; **you do not track replies, callbacks, or message IDs**. Just POST and move on.\n\n| Button | Behavior |\n|--------|----------|\n| **Approve** | Sidecar sends draft to WhatsApp |\n| **Edit** | Opens a Telegram WebApp with draft pre-filled; Vincent edits + sends; sidecar pushes to WhatsApp |\n| **Context** | Opens a Telegram WebApp with empty textarea + the original WA message visible; Vincent provides context. (When `HERMES_REDRAFT_WEBHOOK` is wired in the sidecar's env, this will POST back to Jimmy for a redraft. Until then, the context is just stored on the draft.) |\n| **Pass** | Sidecar deletes the draft |\n\n### JSON escaping when `draft_text` contains quotes\n\nSingle quotes (`It's`, `God's`) break shell interpolation. Write the JSON to a temp file first:\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\"contact\":\"6591234567@s.whatsapp.net\",\"contact_name\":\"John\",\"summary\":\"…\",\"draft_text\":\"It's nice to hear from you!\"}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\n### Networking\n\n`https://jimmy-draft.huper.asia` is the public HTTPS endpoint of the sidecar (TLS-terminated by Traefik on ubuntu-server). HTTPS is required because Hermes blocks plain-HTTP commands via tirith. The internal Docker DNS hostname http://jimmy-wa-v64mqe:8080 also works for non-shell calls but the agent is forbidden from using it.\n\n### Sidecar status & logs\n\n- Source: (auto-deploys on push)\n- Public URL: (`/health` for ping)\n- Container: `jimmy-wa-v64mqe.1.` on ubuntu-server\n- Bot: `@jimmy_WA_bot`\n\n## Handling a Redraft Trigger (`wa-redraft` webhook)\n\nWhen Vincent clicks the **Context** button on a draft, the sidecar:\n1. Edits the original Telegram message to \"🔄 Adding context for X… agent is redrafting.\" (removes the old buttons so Vincent can't approve the now-stale text)\n2. POSTs to your `wa-redraft` webhook (`http://localhost:8644/webhooks/wa-redraft`) with the original draft fields plus Vincent's `context` instructions and the `extras` you stored when you first created the draft.\n\nWhen you receive a `wa-redraft` activation, the prompt template will lay out the fields. Your job:\n\n1. Read Vincent's `context` carefully — that's the steering signal. The `previous_draft` is just history, not necessarily a starting point.\n2. Re-fetch recent WA history if the original context isn't enough (`python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`). Use `extras.group_jid` for groups or `contact` for DMs.\n3. Write a NEW draft that addresses Vincent's instructions while staying faithful to the original WA context (in `extras.original_message`, `extras.extracted`).\n4. POST the new draft back to the sidecar **with the same `draft_id`** so the existing draft is updated, not duplicated:\n\n```bash\ncat > /tmp/wa_redraft.json <\",\n \"contact\": \"\",\n \"contact_name\": \"\",\n \"summary\": \"\",\n \"draft_text\": \"\",\n \"extras\": \n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_redraft.json\n```\n\nThe sidecar will send a NEW Telegram message with 4 fresh buttons. Vincent can Approve, Edit, Context-again (loop), or Pass.\n\nDo **not** send to WhatsApp directly during a redraft — the approval flow is the sidecar's job.\n\n## Incoming Message Processing\n\n### Delivery Pipeline (current)\n1. Evolution API POSTs each `MESSAGES_UPSERT` webhook directly to Hermes at `http://localhost:8644/webhooks/whatsapp-incoming` (proxied via `jimmy.huper.asia/webhooks/whatsapp-incoming`)\n2. Hermes webhook route (`whatsapp-incoming`, dynamic subscription) validates and spawns an agent\n3. Agent loads `whatsapp-evolution` skill, follows the rules below\n4. Webhook subscription uses `deliver: log` because webhooks do not support `deliver: local`; the agent explicitly calls `send_message(target=\"telegram\", ...)` only for important/actionable/VIP/event/birthday/prayer/error items, OR posts a draft to the sidecar for approval items.\n\n**No more OpenClaw, no Python webhook server, no port 5001, no keepalive cron.** If messages aren't being processed, check: nginx proxy to port 8644, `hermes webhook list` for the subscription, Evolution API webhook URL setting.\n\n## Contact Rules & Escalation\n\nConfig: `memory/whatsapp-vip-contacts.json`\n\n### Priority Levels\n\n| Priority | Action | Example |\n|----------|--------|---------|\n| **VIP (call-always)** | **CALL Vincent immediately via Twilio without asking.** Do NOT ask for approval first. During quiet hours (1am-10am SGT) suppress call and use Telegram notification only, unless life-and-death emergency. Match VIPs by both phone JID (`659...@s.whatsapp.net`) and LID JID (`...@lid`) when present, because recent chats may arrive with `key.remoteJid` as an LID and the phone number only in `key.remoteJidAlt`. | Dn Jeff (+6597630553) |\n| **Family** | Always notify Vincent on Telegram | ps112 group, family members |\n| **Auto-reply** | Reply based on context file, notify Vincent | Repair shops |\n| **Normal** | Analyze intent, reply if appropriate, notify if important | Everyone else |\n| **Broadcast/Spam** | Ignore silently | Forwarded job posts, chain messages |\n\n### Quiet Hours\n- 1am-10am SGT (17:00-02:00 UTC): avoid phone calls, including for Dn Jeff / VIP call-always contacts\n- **Exception:** life-and-death emergencies always call regardless of quiet hours\n- During quiet hours, use Telegram notification only and mention that a call was suppressed due to quiet hours\n\n## Making a Call via Twilio\n\nCredentials: `/opt/data/secrets/.twilio-credentials-new` (`TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`). From: `+16614809287`, Vincent: `+6590054438`.\n\nPreferred script: `/opt/data/scripts/call_vincent.py`, now using the Twilio Python SDK baked into the container.\n\n```bash\npython3 /opt/data/scripts/call_vincent.py \"Hello Vincent, this is Jimmy. Urgent summary here.\"\n```\n\n## Action Rules\n\n### Default: DO NOT auto-reply directly, and do not spam main Telegram\nCollect intent and stay local unless something truly needs Vincent's immediate attention.\n\n**When you need Vincent to approve a reply, POST a draft to the sidecar (see \"Draft Approvals\" above).** The sidecar handles the Telegram notification with buttons, the WebApp for editing, and sending to WhatsApp on approve. Do **not** duplicate that draft or summary in the main Telegram chat.\n\nMain Telegram WhatsApp summaries are muted by default. Only use `send_message(target=\"telegram\", ...)` for urgent/important/actionable items that need Vincent now: VIP actionable messages/calls, missed/ringing family calls, emergencies, safety/medical/school-pickup issues, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors. Routine logistics, casual family pings, generic group events, links, standalone codes, normal confirmations, and suggested replies should be local-only or sidecar-only.\n\nDo NOT send WhatsApp replies directly unless Vincent explicitly approves them or the category is pre-approved.\n\n### Tone (when proposing or sending replies)\n- Casual, direct, no sign-offs (no \"Thanks & Regards\", no \"- Vincent\")\n- Ask questions instead of giving instructions\n- Keep it natural — like texting a friend\n- Christian and grace-based: warm, humble, encouraging, non-judgmental, peace-making\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- For prayer/birthday contexts, use gentle faith-filled language; avoid sounding preachy, performative, or overly formal\n\n### What to Do With Each Message\n\n**DMs (personal chats):**\n- Analyze intent\n- Notify Vincent on Telegram (via `send_message(target=\"telegram\", ...)`) only if important/actionable/time-sensitive, requires a decision, comes from a VIP contact\n- If you have a useful suggested reply, POST it as a draft to the sidecar instead of a Telegram notification — Vincent gets the buttons inline\n- Casual/social/non-actionable DMs → local log only, no Telegram notification, no draft\n- Do NOT reply unless Vincent says to\n- VIP contacts → **call Vincent immediately without asking**. Do NOT post a draft. Just call. During quiet hours (1am-10am SGT), suppress the call and send Telegram notification only, unless life-and-death emergency.\n\n**Group chats — events/meetups:**\n- If someone posts about an event, gathering, meetup, dinner, etc:\n 1. Extract: what, when, where, who\n 2. Notify Vincent on Telegram with details (no draft needed unless you also want to suggest an RSVP reply)\n 3. Ask: \"Want me to add this to Google Calendar and set a reminder?\"\n\n**Group chats — prayer requests / praying for others:**\n- If someone asks for prayer, shares a burden, illness, grief, struggle, thanksgiving, or asks the group to pray:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Identify who/what to pray for and any sensitive details\n 3. Draft a short, grace-based Christian reply/prayer\n 4. POST it to the sidecar with `summary` describing the request and `draft_text` being the prayer\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — birthday greetings:**\n- If you see birthday wishes for someone in a group:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Note who the birthday person is and relationship/context if inferable\n 3. Draft a warm, grace-based Christian birthday greeting\n 4. POST it to the sidecar\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — general:**\n- Casual banter, memes, forwards, broadcast spam → ignore silently\n- Religious content shares → ignore unless specifically addressed to Vincent\n\n### When to Notify Vincent (Telegram) vs Post a Draft\n\n**Use `send_message(target=\"telegram\", ...)` sparingly** for plain notifications only when the message needs Vincent's immediate attention: VIP actionable DMs/calls, emergencies, school pickup/child logistics requiring action now, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors.\n\n**Use the sidecar (POST `/drafts`)** whenever you have a suggested reply for Vincent to approve: prayer responses, birthday greetings, DM replies, group RSVPs. Do not also send the same summary to the main Telegram chat.\n\nDo **not** notify or draft for casual/social/non-actionable DMs, routine logistics, read receipts/status updates, religious shares not addressed to Vincent, memes, forwarded/broadcast/spam, generic group events, standalone codes, normal confirmations, or general group chatter. Family messages are not automatically main-chat alerts unless urgent, safety-related, school/pickup critical, or explicitly requiring Vincent now.\n\n## Intent Categories\n\nWhen analyzing incoming messages, classify as:\n\n1. **Urgent** — time-sensitive, emergencies → call Vincent outside quiet hours; during quiet hours use Telegram only unless life-and-death\n2. **Important** — decisions, money, appointments, anything consequential → call Vincent outside quiet hours; during quiet hours notify on Telegram only unless life-threatening\n3. **Request** — asking Vincent to do something → **notify Vincent** (Telegram) or post a draft if you have a safe reply\n4. **Enquiry** — asking for info, pricing, availability → auto-reply if context available, otherwise post a draft\n5. **Confirmation** — confirming plans, bookings → acknowledge and notify Vincent\n6. **Casual/Social** — greetings, memes, forwards → usually ignore or react\n7. **Broadcast/Spam** — forwarded job posts, chain messages → ignore\n\n## Finding Individual Contacts and JIDs\n\nWhen Vincent asks to monitor or prioritize a specific WhatsApp contact and only gives a name:\n\n1. Query contacts and chats, then search both display names and serialized message metadata:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n python3 - <<'PY'\n import os, requests, json\n base=os.environ['EVOLUTION_API_URL']; inst=os.environ['EVOLUTION_INSTANCE']; key=os.environ['EVOLUTION_API_KEY']\n headers={'apikey':key,'Content-Type':'application/json'}\n terms=['whye','keong'] # replace with lowercase search terms\n for ep in ['findContacts','findChats']:\n r=requests.post(f'{base}/chat/{ep}/{inst}', headers=headers, json={}, timeout=30)\n print('endpoint', ep, 'status', r.status_code)\n for obj in (r.json() if r.ok else []):\n if any(t in json.dumps(obj, ensure_ascii=False).lower() for t in terms):\n print(json.dumps(obj, ensure_ascii=False)[:1000])\n PY\n ```\n2. If `findChats` shows `remoteJid` ending in `@lid`, inspect `lastMessage.key.remoteJidAlt` for the phone JID. Store both in `memory/whatsapp-vip-contacts.json` when configuring VIP/call rules.\n3. Verify the number exists with `/chat/whatsappNumbers/{instance}` before finalizing.\n4. Validate the JSON config after editing: `python3 -m json.tool /opt/data/agent-state/memory/whatsapp-vip-contacts.json >/tmp/wa_vip_validated.json`.\n\n## Finding Group JIDs\n\nThe `group/fetchGroups` endpoint returns 404 on some Evolution API instances (including ours). To find a group JID when you don't have it:\n\n1. **Use `chat/findMessages` with an empty `where` filter** to pull recent messages across all chats, then extract unique `remoteJid` values containing `@g.us`.\n2. **Match by content**: Fetch messages from each group JID using `where.key.remoteJid` and match sender names or message text to identify the target group.\n3. **Known group JIDs** (store in `whatsapp-vip-contacts.json` as you discover them):\n - ps112 family group: `6591887488-1464061170@g.us`\n - Overcomers group: `6591444911-1614821287@g.us`\n - 神的华美 group: `6596756578-1405873840@g.us` (content-sharing, casual/broadcast, local log only)\n - Overcomers/TCians group: `120363230236878544@g.us` (church fellowship, PJ sends weekly attendance reports, routine attendance = local log only)\n\nExample search:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"6591444911-1614821287@g.us\"}}, \"limit\": 10, \"order\": \"desc\"}'\n```\n\n## Processing Image/Media Messages\n\nWhen an incoming WhatsApp message is an image (imageMessage), video, or other media with no text or only a caption:\n\n1. **Fetch conversation context** as usual with `wa_recent_messages.py`\n2. **Download the media** using the Evolution API:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"message\": {\"key\": {\"id\": \"MESSAGE_ID\", \"fromMe\": false, \"remoteJid\": \"GROUP_OR_CHAT_JID\"}}, \"convertToMp4\": false}' \\\n -o /tmp/wa_media_response.json\n ```\n Then decode: `python3 -c \"import json,base64; d=json.load(open('/tmp/wa_media_response.json')); open('/tmp/wa_media.jpg','wb').write(base64.b64decode(d['base64']))\"`\n3. **Analyze the image** using `vision_analyze` tool with `/tmp/wa_media.jpg` path and a context-aware question\n4. **Fallback if vision is unavailable**: Use `tesseract /tmp/wa_media.jpg stdout` for OCR, combined with pixel analysis (PIL/numpy) and conversation context to infer what the image contains\n5. **Album messages** (`albumMessage`): These bundle multiple images. Each image arrives as a separate `imageMessage` with a `messageAssociation.parentMessageKey` linking them to the album. Download each image separately. The `expectedImageCount` field tells how many images to expect\n6. **messageAssociation**: When images have `parentMessageKey` with `fromMe: true`, they are replying to a message Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an incoming WhatsApp message is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification when possible:\n\n1. **Fetch conversation context first**, especially in groups:\n ```bash\n python3 /opt/data/scripts/wa_recent_messages.py \"\" 50\n ```\n2. **Download the audio** with Evolution API, saving the response to a file:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_audio_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"GROUP_OR_CHAT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_audio_req.json -o /tmp/wa_audio_response.json\n python3 - <<'PY'\n import json, base64\n d=json.load(open('/tmp/wa_audio_response.json'))\n b=d.get('base64') or d.get('data',{}).get('base64')\n if ',' in b[:100]: b=b.split(',',1)[1]\n open('/tmp/wa_audio.ogg','wb').write(base64.b64decode(b))\n PY\n ```\n3. **Convert OGG/Opus to WAV**:\n ```bash\n ffmpeg -y -i /tmp/wa_audio.ogg -ar 16000 -ac 1 /tmp/wa_audio.wav\n ```\n4. **Transcribe fallback when Whisper is unavailable**: the container may not have `whisper`, `faster-whisper`, `torch`, or `speech_recognition` installed globally. A reliable lightweight fallback is a temp venv with SpeechRecognition and Google web speech:\n ```bash\n python3 -m venv /tmp/srvenv\n /tmp/srvenv/bin/pip install --quiet SpeechRecognition\n /tmp/srvenv/bin/python - <<'PY'\n import speech_recognition as sr\n r=sr.Recognizer()\n with sr.AudioFile('/tmp/wa_audio.wav') as source:\n audio=r.record(source)\n print(r.recognize_google(audio, language='en-SG'))\n PY\n ```\n5. **Check adjacent voice notes from the same sender**. WhatsApp voice notes are often sent in quick sequential chunks. Download and transcribe nearby `audioMessage` records from the same sender before deciding whether to notify Vincent.\n6. Use the transcript plus recent context to classify the message. For group meetup planning, extract what/when/where/who and notify Vincent if actionable, including transcript confidence.\n\n### Video Messages\n\nWhen an incoming WhatsApp message is `videoMessage` (with or without a caption):\n\n1. **Fetch conversation context** first, as with any media message: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n2. **Download the video** using Evolution API (same pattern as images/audio — save response to file, then decode):\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_video_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"CONTACT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_video_req.json -o /tmp/wa_video_response.json\n python3 - <<'PY'\n import json, base64\n d = json.load(open('/tmp/wa_video_response.json'))\n b = d.get('base64') or d.get('data', {}).get('base64', '')\n if ',' in b[:100]: b = b.split(',', 1)[1]\n open('/tmp/wa_video.mp4', 'wb').write(base64.b64decode(b))\n print(f\"Video saved: {len(base64.b64decode(b))} bytes\")\n PY\n ```\n3. **Extract representative frames** with ffmpeg (do NOT try to analyze video directly — vision tools work on images):\n ```bash\n # Frame at start\n ffmpeg -y -ss 0 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame0.jpg\n # Frame at ~1/3 of the way (good for short videos)\n ffmpeg -y -ss 3 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame3s.jpg\n # If video is longer (>15s), also extract at midpoint\n # ffprobe -v quiet -print_format json -show_format /tmp/wa_video.mp4 # get duration\n # ffmpeg -y -ss $HALF_DURATION -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame_mid.jpg\n ```\n4. **Analyze the frames** with `vision_analyze`. Ask specific questions about content: what objects/people/text are visible, what type of video this appears to be (screen recording, personal clip, forwarded meme, etc.), and any text or identifying details.\n5. **Use the caption** from `videoMessage.caption` if present (can be empty/null). Combine frame analysis + caption + conversation context to classify the message intent.\n6. **Transcription of audio in video** is generally not feasible with available tools. If the video likely contains important spoken content (e.g. voice memo style), note this limitation and flag it to Vincent.\n\n**Key points:**\n- Always use `-o` flag and file-based decode; videos can be 1-10MB and base64 responses will be even larger\n- `--max-time 25` on curl is essential for video downloads\n- Short casual clips (showing a key fob, a parking spot, quick view of surroundings) are usually casual/social — classify accordingly and don't over-escalate\n- If `vision_analyze` fails (400 on models that don't support images, or 429 rate limit), fall back to conversation context inference alone\n\n### Poll Messages\nWhen `pollUpdateMessage` appears in chat history (someone voted), the message contains a `pollCreationMessageKey` with the original poll's message `id`. To get the poll question and options, query:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"id\": \"POLL_MESSAGE_ID\", \"remoteJid\": \"GROUP_JID\"}}, \"limit\": 1}'\n```\nThe response will have `pollCreationMessageV3` with `name` (the poll question) and `options` array. Always look up the poll when votes or poll references appear, since the poll text contains the event details.\n\n### `associatedChildMessage` (Motion Photos / Companion Media)\nWhatsApp motion photos and companion media arrive as `associatedChildMessage` type. These are short videos or supplementary images attached to a parent message via `messageAssociation.parentMessageKey`. The structure is:\n\n```\nmessage: {\n messageContextInfo: {\n messageAssociation: {\n associationType: 12,\n parentMessageKey: { id, fromMe, remoteJid }\n }\n },\n associatedChildMessage: {\n message: {\n videoMessage: { ... } // or imageMessage: { ... }\n }\n }\n}\n```\n\nKey points:\n- The `parentMessageKey.id` links back to the main message (the image/video they accompany)\n- When `parentMessageKey.fromMe: true`, the companion media is replying to a message Vincent sent\n- Captions may appear on the child's `videoMessage.caption` or `imageMessage.caption` inside the nested message, NOT at the top level\n- These often contain motion-photo short videos (3 seconds) or album companion images\n- Treat them as media attachments to the parent; don't process them as standalone messages requiring independent intent analysis\n\n### Sequential Image Sets (Non-Album)\nMultiple images sent rapidly by the same sender at similar timestamps may NOT be wrapped in an `albumMessage`. They appear as separate `imageMessage` entries. Always check for adjacent images from the same sender and download+OCR all of them together, as they often form a coherent multi-page set (e.g., route info, comparison tables, maps). Missing one can lose important context.\n\n### Key pitfalls\n- The Evolution API `getBase64FromMediaMessage` endpoint returns the full base64 payload (can be 100KB+ for images). Use `-o` flag to save to file, then parse with Python. Do NOT try to pipe it directly\n- `curl` with `--max-time 25` is essential; without it, large media responses can hang\n- The `albumMessage` type itself has no `fromMe` field; check each individual `imageMessage` instead\n- Image thumbnails in the webhook payload are truncated; always download the full image for analysis\n- **WhatsApp media URLs expire**: CDN download links (`mmg.whatsapp.net`) expire after roughly 24-48 hours. When `getBase64FromMediaMessage` returns a 400/404 error like \"Failed to fetch stream from https://mmg.whatsapp.net/...\", the media is no longer retrievable via the API. This means you CANNOT download adjacent/older voice notes, images, or documents for context if they're more than a day or two old. Fall back to conversation text context only. Fresh/incoming messages are almost always still downloadable.\n- **Voice note transcription reliability**: Short voice notes (<15s), especially in Singlish/Chinese-English mix common in Singapore contacts, often produce unreliable transcriptions via SpeechRecognition+Google. Always flag low-confidence transcriptions to Vincent rather than acting on potentially garbled text.\n- **Vision API failures**: `vision_analyze` can fail with 429 (rate limit) **or 400** (model doesn't support image input, e.g. glm-5.1 via ollama-cloud). Always have a fallback plan. When vision is unavailable, try tesseract OCR, then pixel analysis (PIL/numpy for brightness zones, text region detection via row-level dark-pixel density)\n- **OCR can fail on decorative/complex images**: Obituaries, memorial notices, religious flyers with ornate fonts, or dark-background images often defeat tesseract even with aggressive preprocessing (upscaling, thresholding, inverting, PSM mode variations). When OCR returns gibberish or empty, fall back to conversation context inference: use preceding/surrounding messages to determine what the image likely contains, then draft an appropriate response based on that context\n- **OCR on devotional/dark-background images**: These often defeat tesseract on full-image scans. Use a multi-strategy approach: (1) run tesseract on the full image first, (2) if text is sparse, crop into regions (top 15%, middle, bottom 15%), upscale 2-3x, enhance contrast (2-3x), and try both normal and inverted (`ImageOps.invert`) versions with `--psm 6`, (3) combine all partial OCR results. For Exodus/verse images, the title (e.g. \"WHEN MOSES' HANDS WERE\") is often in decorative font that OCR misses, while the verse body text is usually readable.\n- **Prioritize conversation context**: When processing group images, always fetch recent context FIRST (before attempting vision/OCR). If image analysis fails, the conversation history often makes the image's purpose clear enough to draft a meaningful reply\n- **DM conversation context is one-sided (fromMe only)**: The Evolution API `findMessages` endpoint often returns only Vincent's sent messages (`fromMe: true`) for DM chats, not the contact's incoming messages. This means `wa_recent_messages.py` can show only one side of the conversation. When processing a DM webhook, do NOT rely solely on fetched history for the contact's side. The webhook payload already contains the incoming message text. Use what the webhook gives you plus Vincent's side from the API. If you need the contact's recent messages, check if `fromMe: false` records exist; if none are returned, the API simply doesn't have them and you must work with what you have (webhook message + Vincent's replies for context).\n- **`findMessages` ignores `key.fromMe` filter**: Passing `\"key.fromMe\": false` in the `where` clause still returns all messages (including `fromMe: true`). To find messages from a specific sender, fetch results and filter client-side by checking `fromMe` in Python.\n- **`findMessages` caps at ~50 records per request**: Even with `limit: 200` or `limit: 500`, the API returns at most 50 records. When one party sends many consecutive messages, the other party's messages may not appear in the first 50 results. Use pagination or fetch multiple pages if you need older messages.\n- **Video frame extraction**: Use `ffmpeg -y -ss N -i video.mp4 -frames:v 1 -update 1 output.jpg` (the `-update 1` flag is needed; without it ffmpeg treats the output as an image sequence and fails). Extract frames at multiple timestamps (start, ~1/3, midpoint for longer videos) since a single frame may miss the key content.\n- **ffmpeg \"Invalid color space\" failure on WhatsApp videos**: Some WhatsApp H.264 videos have `color_space=reserved` and `color_primaries=reserved` metadata that ffmpeg cannot decode at all. Every codec/format/filter combination fails with \"Invalid color space\" / return code 234. This is NOT fixable with `-vf setparams`, `-color_trc`, `-colorspace`, `-color_primaries`, or `-vf format=` flags because the error occurs at the decoder level before any filter runs. **Do not waste time trying ffmpeg workarounds.** When this happens, fall back to: (1) conversation context inference (who sent it, what thread it's in, adjacent messages), (2) the jpegThumbnail embedded in the videoMessage (small but gives a rough idea; base64-encoded JPEG, typically 600-900 bytes, use `vision_analyze` on it if available), or (3) if all analysis fails, just classify based on surrounding text context.\n- **`wa_recent_messages.py` JSON output breaks `json.loads()`**: The script's stdout contains WhatsApp message content with control characters (emoji, CJK, deviceListMetadata binary blobs) that cause `json.loads()` to fail with `Invalid control character` or `Expecting ',' delimiter` errors. Even `strict=False` does not reliably fix it. **Reliable approach**: use `curl -o /tmp/wa_msgs.json` to save the Evolution API response directly to a file (the file write handles binary safely), then parse with `json.load(open('/tmp/wa_msgs.json'))` in Python. Example:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 15 -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"CONTACT_JID\"}}, \"limit\": 20}' -o /tmp/wa_msgs.json\n ```\n Then in Python: `data = json.load(open('/tmp/wa_msgs.json')); records = data['messages']['records']`\n\n## API Reference\n\nFor full API endpoints (media, contacts, location, buttons) — see [references/api.md](references/api.md).\n\n[Skill directory: /opt/data/skills/personal/whatsapp-evolution]\nResolve any relative paths in this skill (e.g. `scripts/foo.js`, `templates/config.yaml`) against that directory, then run them with the terminal tool using the absolute path.\n\n[This skill has supporting files:]\n- references/api.md -> /opt/data/skills/personal/whatsapp-evolution/references/api.md\n- scripts/process_queue.py -> /opt/data/skills/personal/whatsapp-evolution/scripts/process_queue.py\n- scripts/send_text.sh -> /opt/data/skills/personal/whatsapp-evolution/scripts/send_text.sh\n\nLoad any of these with skill_view(name=\"personal/whatsapp-evolution\", file_path=\"\"), or run scripts directly by absolute path (e.g. `node /opt/data/skills/personal/whatsapp-evolution/scripts/foo.js`).\n\nThe user has provided the following instruction alongside the skill invocation: Incoming WhatsApp webhook received.\n\nEvent: messages.upsert\nInstance: n8n-WA\nSender: Hui Xian\nSender JID: 120363418226875471@g.us\nFrom me: False\nMessage type: reactionMessage\nMessage: {\n \"event\": \"messages.upsert\",\n \"instance\": \"n8n-WA\",\n \"data\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"3A984E632D40174CA880\",\n \"participant\": \"176888278429759@lid\",\n \"participantAlt\": \"6596725405@s.whatsapp.net\",\n \"addressingMode\": \"lid\"\n },\n \"pushName\": \"Hui Xian\",\n \"status\": \"DELIVERY_ACK\",\n \"message\": {\n \"reactionMessage\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"ACAF388AF06A2F0E1C1777A7917ED565\",\n \"participant\": \"207726193995780@lid\"\n },\n \"text\": \"\\u2764\\ufe0f\",\n \"senderTimestampMs\": {\n \"low\": -1593526297,\n \"high\": 414,\n \"unsigned\": false\n }\n }\n },\n \"messageType\": \"reactionMessage\",\n \"messageTimestamp\": 1780817901,\n \"instanceId\": \"b2bfaa30-d5f1-4b22-b979-791892619414\",\n \"source\": \"ios\"\n },\n \"destination\": \"https://jimmy.huper.asia/webhooks/whatsapp-incoming\",\n \"date_time\": \"2026-06-07T04:38:22.119Z\",\n \"sender\": \"6590054438@s.whatsapp.net\",\n \"server_url\": \"https://evolution.huper.asia\",\n \"apikey\": \"5E4C532223A0-460B-A749-16C584F56E74\"\n}\n\nPrimary goal: REDUCE NOISE. Stay silent in the main Telegram chat unless the message truly needs Vincent's immediate attention. Approval-worthy reply drafts should go to the jimmy-wa-drafts sidecar, NOT to the main Telegram chat.\n\nFollow the whatsapp-evolution skill and /opt/data/agent-state/memory/whatsapp-vip-contacts.json, with these stricter notification rules:\n\nABSOLUTE SILENCE RULES:\n- If fromMe is true: ignore. Do NOT call send_message. Final response local log only.\n- Ignore status/update/read-receipt events or payloads with no real message content.\n- Do NOT send main Telegram summaries for ordinary DMs, routine logistics, casual family messages, generic group events, group chatter, forwards, memes, devotional/religious shares, links, standalone codes, normal confirmations, or suggested replies.\n- Do NOT ask \"Want me to add this to Calendar?\" for WhatsApp group events unless Vincent is explicitly invited, named, responsible, or the event is time-sensitive and clearly relevant to him today.\n- Do NOT surface family messages just because they are family. Only surface if urgent, safety-related, school/pickup critical, money/legal/medical, or explicitly requires Vincent now.\n\nSIDEcar RULES:\n- If a reply may be useful but it is not urgent, POST it to https://jimmy-draft.huper.asia/drafts using the whatsapp-evolution Draft Approvals flow. Do NOT duplicate that draft/summary in the main Telegram chat.\n- Prayer and birthday replies should go to the sidecar when appropriate, not the main Telegram chat, unless the underlying situation is urgent/sensitive and Vincent needs immediate awareness.\n- Before drafting any reply or approval suggestion, fetch recent context: python3 /opt/data/scripts/wa_recent_messages.py \"120363418226875471@g.us\" 30. For active/ambiguous groups use 50.\n\nMAIN TELEGRAM ALERTS ARE ALLOWED ONLY FOR:\n1. VIP contacts marked always_call or always_notify when the message is important/actionable, or any VIP call event.\n2. Calls from family/VIP that appear missed/ringing and likely need immediate attention. Avoid repeated alerts for accepted/terminated duplicate call state updates.\n3. Emergencies, safety issues, medical issues, school pickup/child logistics requiring action now, legal/money/account/security matters, or time-sensitive commitments in the next few hours.\n4. Messages explicitly asking Vincent to call back, decide now, approve payment, provide OTP/code for an active task, or respond urgently.\n5. Pipeline errors or webhook processing failures.\n\nCALL RULES:\n- VIP contacts (e.g. Dn Jeff, Ng Whye Keong): call Vincent immediately outside quiet hours when actionably important or on ringing/missed call. During 1am-10am SGT, Telegram only unless life-and-death emergency.\n- Do not call for casual VIP confirmations or non-actionable courtesy messages.\n\nTONE/RULES FOR DRAFTS:\n- Proposed replies must be Christian and grace-based where appropriate: warm, humble, encouraging, non-judgmental, peace-making, not preachy.\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- NEVER auto-reply directly on WhatsApp unless Vincent explicitly approves or future config marks that exact category pre-approved.\n\nDelivery rule:\n- For allowed main alerts only: call send_message(target=\"telegram\", message=...). Keep it concise.\n- For non-urgent reply suggestions: use the sidecar only.\n- For everything else: DO NOT call send_message. Keep final response as local-only log.\n", "timestamp": 1780817935.095733, "anchor": true}, {"id": 104156, "role": "assistant", "content": "This is a **reaction message** (❤️) from Hui Xian in group `120363418226875471@g.us`, reacting to another member's message. No text content, no actionable intent, not from a VIP contact, not urgent. Per the noise reduction rules, reactions are casual/social activity with no need for notification or draft.\n\n**Local log only. No Telegram alert, no sidecar draft.**", "timestamp": 1780817935.194004}], "bookend_end": [], "messages_before": 0, "messages_after": 2}, {"session_id": "20260607_073840_dd083519", "when": "June 07, 2026 at 07:38 AM", "source": "webhook", "model": "glm-5.1", "title": null, "matched_role": "user", "match_message_id": 104152, "snippet": "...true`, they are replying to a >>>message<<< Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an >>>incoming<<< WhatsApp >>>message<<< is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification...", "bookend_start": [], "messages": [{"id": 104152, "role": "user", "content": "[IMPORTANT: The user has invoked the \"whatsapp-evolution\" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]\n\n---\nname: whatsapp-evolution\ndescription: WhatsApp messaging via Evolution API — send, receive, analyze, and auto-reply. Use when processing an incoming WhatsApp message delivered via the webhook, replying on Vincent's behalf, liaising with contacts, sending messages, or handling WhatsApp-related tasks. Covers intent analysis, auto-reply rules, escalation (VIP → phone call), conversation context fetching, and draft approvals via the jimmy-wa-drafts sidecar.\n---\n\n# WhatsApp via Evolution API\n\n## Credentials\nStored in `/opt/data/secrets/.whatsapp-evolution-credentials`:\n- `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE`\n\nLoad with: `export $(cat .whatsapp-evolution-credentials | xargs)`\n\n## Sending Messages (direct, no approval)\n\n```bash\nexport $(cat .whatsapp-evolution-credentials | xargs)\ncurl -s -X POST \"${EVOLUTION_API_URL}/message/sendText/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"number\": \"6591234567\", \"text\": \"Hello!\"}'\n```\n\n- Phone format: country code, no `+` (e.g. `6590054438`)\n- Groups: use JID `{groupid}@g.us`\n- For media/location/contacts — see [references/api.md](references/api.md)\n\nUse direct send only when Vincent has explicitly commanded/approved the exact send, or for pre-approved auto-replies/follow-ups. For any reply that needs Vincent's review, use the sidecar (next section).\n\n### Direct replies when Vincent names an existing WhatsApp thread\n\nIf Vincent says to “reply to X” or “confirm with X” and has clearly approved the action, but the contact JID/number is not in the current prompt:\n\n1. Search recent session history first (`session_search`) for the contact name, clinic/business name, or appointment details. Prior webhook summaries often contain the exact `remoteJid`, alternate phone JID, sender name, and context.\n2. If session history identifies the WhatsApp phone JID, send directly only if Vincent’s instruction is explicit enough to act. Keep the text concise and natural.\n3. If only a LID JID is found, look for `remoteJidAlt`/alternate phone JID in the session summary or fetch contacts/chats before sending.\n4. Do not invent phone numbers from web search when replying to an existing WhatsApp conversation. Use web search only as supporting context, not as the destination, unless Vincent explicitly asks to contact a public number.\n\n\n### Sending trip/camp details from retrieved emails\n\nWhen Vincent explicitly asks to send extracted school trip/camp details to a child or family member, send directly. Keep it practical and grounded:\n- Include dates, reporting/pick-up times, flights, venue/itinerary highlights, deadlines, forms, and safety/medication notes when found.\n- If an official packing list is missing, say that clearly. Do not invent one. You may add a short “based on itinerary/weather, prepare…” list, clearly labeled as practical guidance rather than official school instructions.\n- Avoid long explanations about the search process in WhatsApp. Send the useful details.\n\n### Sending a local image/file as media\n\nFor user-approved media sends, base64-encode the local file and call `sendMedia`:\n\n```bash\nexport $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\npython3 - <<'PY'\nimport base64, json\npath = '/tmp/file.png'\npayload = {\n 'number': '6591234567',\n 'mediatype': 'image',\n 'mimetype': 'image/png',\n 'caption': 'Caption here',\n 'media': base64.b64encode(open(path, 'rb').read()).decode(),\n 'fileName': 'file.png',\n}\nopen('/tmp/wa_media_payload.json', 'w').write(json.dumps(payload))\nPY\ncurl -s --max-time 60 -X POST \"${EVOLUTION_API_URL}/message/sendMedia/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_media_payload.json\n```\n\nWhen sending QR tickets or passes, verify the crop/image contains the full QR code and the intended attendee name before sending; the QR screenshot is effectively the redeemable ticket.\n\n## Fetching Conversation Context\n\nBefore drafting or proposing any reply, always fetch recent chat history for context:\n\n```bash\npython3 /opt/data/scripts/wa_recent_messages.py \"6591234567@s.whatsapp.net\" 30\n```\n\nUse `limit=30` by default. For active group chats or ambiguous situations, fetch up to 50 messages. Read enough context to understand who is being discussed, current tone, previous commitments, and whether Vincent has already replied. The script returns newest messages first (descending order).\n\n## Draft Approvals — `jimmy-wa-drafts` sidecar\n\nWhen you need Vincent to approve a WhatsApp reply, POST the draft to the sidecar. It owns the entire approval flow: Telegram message with 4 inline buttons, Telegram WebApp for Edit/Context, sending to WhatsApp via Evolution API on Approve.\n\n### Posting a draft\n\nAlways include an `extras` object with the original WA context. The sidecar passes `extras` through verbatim if Vincent later clicks \"Context\" — without it, your redraft will be flying blind.\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\n \"contact\": \"120363351787895353@g.us\",\n \"contact_name\": \"Cycling group\",\n \"summary\": \"Karen sent a blessing over the bike ride; she's not joining\",\n \"draft_text\": \"Thanks Karen, will miss you both — praying for a good time too 🙏\",\n \"extras\": {\n \"group_jid\": \"120363351787895353@g.us\",\n \"sender_name\": \"Karen Tan\",\n \"sender_jid\": \"6597630553@s.whatsapp.net\",\n \"category\": \"prayer/blessing\",\n \"original_message\": \"Happy cycling, to those who can make it… Dn Jeff and I had made plans for this PM so we won't be able to join y'all. Praying the Lord's favor, protection and good-weather blessings be upon y'all and your route!\",\n \"extracted\": {\n \"what\": \"Cycling from East Coast Park to Changi Village, then dinner\",\n \"when\": \"Today, 4:00 PM\",\n \"where\": \"East Coast Park to Changi Village\",\n \"who\": \"Group members who voted yes for cycling/dinner or dinner\"\n },\n \"confidence\": \"high\",\n \"why_appropriate\": \"Multiple group messages are blessings/wishes; Vincent has not replied; tone matches grace-based norm.\",\n \"wa_message_id\": \"ABC123XYZ\"\n }\n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\nResponse: `{\"draft_id\":\"abc123def\",\"telegram_message_id\":42}`\n\nWhat to put in `extras` (use whatever makes sense for the message type):\n\n- **Always**: `original_message` (verbatim WA text), `sender_name`, `category` (e.g. `prayer`, `birthday`, `event`, `dm-reply`, `enquiry`)\n- **Group messages**: `group_jid`, `sender_jid`\n- **DMs**: `sender_jid`\n- **Strongly recommended (drives the rich Telegram render)**: `extracted` (dict of who/what/when/where), `confidence` (`high`/`medium`/`low`), `why_appropriate` (one-line reasoning shown to Vincent so he can decide)\n- **Optional**: `wa_message_id`, `tone_hint` (e.g. `warm-grace-based`), `sender_phone` (auto-derived from `sender_jid` if omitted)\n\nVincent receives a Telegram message in the `@jimmy_WA_bot` chat with 4 inline buttons. The sidecar handles every interaction; **you do not track replies, callbacks, or message IDs**. Just POST and move on.\n\n| Button | Behavior |\n|--------|----------|\n| **Approve** | Sidecar sends draft to WhatsApp |\n| **Edit** | Opens a Telegram WebApp with draft pre-filled; Vincent edits + sends; sidecar pushes to WhatsApp |\n| **Context** | Opens a Telegram WebApp with empty textarea + the original WA message visible; Vincent provides context. (When `HERMES_REDRAFT_WEBHOOK` is wired in the sidecar's env, this will POST back to Jimmy for a redraft. Until then, the context is just stored on the draft.) |\n| **Pass** | Sidecar deletes the draft |\n\n### JSON escaping when `draft_text` contains quotes\n\nSingle quotes (`It's`, `God's`) break shell interpolation. Write the JSON to a temp file first:\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\"contact\":\"6591234567@s.whatsapp.net\",\"contact_name\":\"John\",\"summary\":\"…\",\"draft_text\":\"It's nice to hear from you!\"}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\n### Networking\n\n`https://jimmy-draft.huper.asia` is the public HTTPS endpoint of the sidecar (TLS-terminated by Traefik on ubuntu-server). HTTPS is required because Hermes blocks plain-HTTP commands via tirith. The internal Docker DNS hostname http://jimmy-wa-v64mqe:8080 also works for non-shell calls but the agent is forbidden from using it.\n\n### Sidecar status & logs\n\n- Source: (auto-deploys on push)\n- Public URL: (`/health` for ping)\n- Container: `jimmy-wa-v64mqe.1.` on ubuntu-server\n- Bot: `@jimmy_WA_bot`\n\n## Handling a Redraft Trigger (`wa-redraft` webhook)\n\nWhen Vincent clicks the **Context** button on a draft, the sidecar:\n1. Edits the original Telegram message to \"🔄 Adding context for X… agent is redrafting.\" (removes the old buttons so Vincent can't approve the now-stale text)\n2. POSTs to your `wa-redraft` webhook (`http://localhost:8644/webhooks/wa-redraft`) with the original draft fields plus Vincent's `context` instructions and the `extras` you stored when you first created the draft.\n\nWhen you receive a `wa-redraft` activation, the prompt template will lay out the fields. Your job:\n\n1. Read Vincent's `context` carefully — that's the steering signal. The `previous_draft` is just history, not necessarily a starting point.\n2. Re-fetch recent WA history if the original context isn't enough (`python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`). Use `extras.group_jid` for groups or `contact` for DMs.\n3. Write a NEW draft that addresses Vincent's instructions while staying faithful to the original WA context (in `extras.original_message`, `extras.extracted`).\n4. POST the new draft back to the sidecar **with the same `draft_id`** so the existing draft is updated, not duplicated:\n\n```bash\ncat > /tmp/wa_redraft.json <\",\n \"contact\": \"\",\n \"contact_name\": \"\",\n \"summary\": \"\",\n \"draft_text\": \"\",\n \"extras\": \n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_redraft.json\n```\n\nThe sidecar will send a NEW Telegram message with 4 fresh buttons. Vincent can Approve, Edit, Context-again (loop), or Pass.\n\nDo **not** send to WhatsApp directly during a redraft — the approval flow is the sidecar's job.\n\n## Incoming Message Processing\n\n### Delivery Pipeline (current)\n1. Evolution API POSTs each `MESSAGES_UPSERT` webhook directly to Hermes at `http://localhost:8644/webhooks/whatsapp-incoming` (proxied via `jimmy.huper.asia/webhooks/whatsapp-incoming`)\n2. Hermes webhook route (`whatsapp-incoming`, dynamic subscription) validates and spawns an agent\n3. Agent loads `whatsapp-evolution` skill, follows the rules below\n4. Webhook subscription uses `deliver: log` because webhooks do not support `deliver: local`; the agent explicitly calls `send_message(target=\"telegram\", ...)` only for important/actionable/VIP/event/birthday/prayer/error items, OR posts a draft to the sidecar for approval items.\n\n**No more OpenClaw, no Python webhook server, no port 5001, no keepalive cron.** If messages aren't being processed, check: nginx proxy to port 8644, `hermes webhook list` for the subscription, Evolution API webhook URL setting.\n\n## Contact Rules & Escalation\n\nConfig: `memory/whatsapp-vip-contacts.json`\n\n### Priority Levels\n\n| Priority | Action | Example |\n|----------|--------|---------|\n| **VIP (call-always)** | **CALL Vincent immediately via Twilio without asking.** Do NOT ask for approval first. During quiet hours (1am-10am SGT) suppress call and use Telegram notification only, unless life-and-death emergency. Match VIPs by both phone JID (`659...@s.whatsapp.net`) and LID JID (`...@lid`) when present, because recent chats may arrive with `key.remoteJid` as an LID and the phone number only in `key.remoteJidAlt`. | Dn Jeff (+6597630553) |\n| **Family** | Always notify Vincent on Telegram | ps112 group, family members |\n| **Auto-reply** | Reply based on context file, notify Vincent | Repair shops |\n| **Normal** | Analyze intent, reply if appropriate, notify if important | Everyone else |\n| **Broadcast/Spam** | Ignore silently | Forwarded job posts, chain messages |\n\n### Quiet Hours\n- 1am-10am SGT (17:00-02:00 UTC): avoid phone calls, including for Dn Jeff / VIP call-always contacts\n- **Exception:** life-and-death emergencies always call regardless of quiet hours\n- During quiet hours, use Telegram notification only and mention that a call was suppressed due to quiet hours\n\n## Making a Call via Twilio\n\nCredentials: `/opt/data/secrets/.twilio-credentials-new` (`TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`). From: `+16614809287`, Vincent: `+6590054438`.\n\nPreferred script: `/opt/data/scripts/call_vincent.py`, now using the Twilio Python SDK baked into the container.\n\n```bash\npython3 /opt/data/scripts/call_vincent.py \"Hello Vincent, this is Jimmy. Urgent summary here.\"\n```\n\n## Action Rules\n\n### Default: DO NOT auto-reply directly, and do not spam main Telegram\nCollect intent and stay local unless something truly needs Vincent's immediate attention.\n\n**When you need Vincent to approve a reply, POST a draft to the sidecar (see \"Draft Approvals\" above).** The sidecar handles the Telegram notification with buttons, the WebApp for editing, and sending to WhatsApp on approve. Do **not** duplicate that draft or summary in the main Telegram chat.\n\nMain Telegram WhatsApp summaries are muted by default. Only use `send_message(target=\"telegram\", ...)` for urgent/important/actionable items that need Vincent now: VIP actionable messages/calls, missed/ringing family calls, emergencies, safety/medical/school-pickup issues, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors. Routine logistics, casual family pings, generic group events, links, standalone codes, normal confirmations, and suggested replies should be local-only or sidecar-only.\n\nDo NOT send WhatsApp replies directly unless Vincent explicitly approves them or the category is pre-approved.\n\n### Tone (when proposing or sending replies)\n- Casual, direct, no sign-offs (no \"Thanks & Regards\", no \"- Vincent\")\n- Ask questions instead of giving instructions\n- Keep it natural — like texting a friend\n- Christian and grace-based: warm, humble, encouraging, non-judgmental, peace-making\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- For prayer/birthday contexts, use gentle faith-filled language; avoid sounding preachy, performative, or overly formal\n\n### What to Do With Each Message\n\n**DMs (personal chats):**\n- Analyze intent\n- Notify Vincent on Telegram (via `send_message(target=\"telegram\", ...)`) only if important/actionable/time-sensitive, requires a decision, comes from a VIP contact\n- If you have a useful suggested reply, POST it as a draft to the sidecar instead of a Telegram notification — Vincent gets the buttons inline\n- Casual/social/non-actionable DMs → local log only, no Telegram notification, no draft\n- Do NOT reply unless Vincent says to\n- VIP contacts → **call Vincent immediately without asking**. Do NOT post a draft. Just call. During quiet hours (1am-10am SGT), suppress the call and send Telegram notification only, unless life-and-death emergency.\n\n**Group chats — events/meetups:**\n- If someone posts about an event, gathering, meetup, dinner, etc:\n 1. Extract: what, when, where, who\n 2. Notify Vincent on Telegram with details (no draft needed unless you also want to suggest an RSVP reply)\n 3. Ask: \"Want me to add this to Google Calendar and set a reminder?\"\n\n**Group chats — prayer requests / praying for others:**\n- If someone asks for prayer, shares a burden, illness, grief, struggle, thanksgiving, or asks the group to pray:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Identify who/what to pray for and any sensitive details\n 3. Draft a short, grace-based Christian reply/prayer\n 4. POST it to the sidecar with `summary` describing the request and `draft_text` being the prayer\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — birthday greetings:**\n- If you see birthday wishes for someone in a group:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Note who the birthday person is and relationship/context if inferable\n 3. Draft a warm, grace-based Christian birthday greeting\n 4. POST it to the sidecar\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — general:**\n- Casual banter, memes, forwards, broadcast spam → ignore silently\n- Religious content shares → ignore unless specifically addressed to Vincent\n\n### When to Notify Vincent (Telegram) vs Post a Draft\n\n**Use `send_message(target=\"telegram\", ...)` sparingly** for plain notifications only when the message needs Vincent's immediate attention: VIP actionable DMs/calls, emergencies, school pickup/child logistics requiring action now, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors.\n\n**Use the sidecar (POST `/drafts`)** whenever you have a suggested reply for Vincent to approve: prayer responses, birthday greetings, DM replies, group RSVPs. Do not also send the same summary to the main Telegram chat.\n\nDo **not** notify or draft for casual/social/non-actionable DMs, routine logistics, read receipts/status updates, religious shares not addressed to Vincent, memes, forwarded/broadcast/spam, generic group events, standalone codes, normal confirmations, or general group chatter. Family messages are not automatically main-chat alerts unless urgent, safety-related, school/pickup critical, or explicitly requiring Vincent now.\n\n## Intent Categories\n\nWhen analyzing incoming messages, classify as:\n\n1. **Urgent** — time-sensitive, emergencies → call Vincent outside quiet hours; during quiet hours use Telegram only unless life-and-death\n2. **Important** — decisions, money, appointments, anything consequential → call Vincent outside quiet hours; during quiet hours notify on Telegram only unless life-threatening\n3. **Request** — asking Vincent to do something → **notify Vincent** (Telegram) or post a draft if you have a safe reply\n4. **Enquiry** — asking for info, pricing, availability → auto-reply if context available, otherwise post a draft\n5. **Confirmation** — confirming plans, bookings → acknowledge and notify Vincent\n6. **Casual/Social** — greetings, memes, forwards → usually ignore or react\n7. **Broadcast/Spam** — forwarded job posts, chain messages → ignore\n\n## Finding Individual Contacts and JIDs\n\nWhen Vincent asks to monitor or prioritize a specific WhatsApp contact and only gives a name:\n\n1. Query contacts and chats, then search both display names and serialized message metadata:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n python3 - <<'PY'\n import os, requests, json\n base=os.environ['EVOLUTION_API_URL']; inst=os.environ['EVOLUTION_INSTANCE']; key=os.environ['EVOLUTION_API_KEY']\n headers={'apikey':key,'Content-Type':'application/json'}\n terms=['whye','keong'] # replace with lowercase search terms\n for ep in ['findContacts','findChats']:\n r=requests.post(f'{base}/chat/{ep}/{inst}', headers=headers, json={}, timeout=30)\n print('endpoint', ep, 'status', r.status_code)\n for obj in (r.json() if r.ok else []):\n if any(t in json.dumps(obj, ensure_ascii=False).lower() for t in terms):\n print(json.dumps(obj, ensure_ascii=False)[:1000])\n PY\n ```\n2. If `findChats` shows `remoteJid` ending in `@lid`, inspect `lastMessage.key.remoteJidAlt` for the phone JID. Store both in `memory/whatsapp-vip-contacts.json` when configuring VIP/call rules.\n3. Verify the number exists with `/chat/whatsappNumbers/{instance}` before finalizing.\n4. Validate the JSON config after editing: `python3 -m json.tool /opt/data/agent-state/memory/whatsapp-vip-contacts.json >/tmp/wa_vip_validated.json`.\n\n## Finding Group JIDs\n\nThe `group/fetchGroups` endpoint returns 404 on some Evolution API instances (including ours). To find a group JID when you don't have it:\n\n1. **Use `chat/findMessages` with an empty `where` filter** to pull recent messages across all chats, then extract unique `remoteJid` values containing `@g.us`.\n2. **Match by content**: Fetch messages from each group JID using `where.key.remoteJid` and match sender names or message text to identify the target group.\n3. **Known group JIDs** (store in `whatsapp-vip-contacts.json` as you discover them):\n - ps112 family group: `6591887488-1464061170@g.us`\n - Overcomers group: `6591444911-1614821287@g.us`\n - 神的华美 group: `6596756578-1405873840@g.us` (content-sharing, casual/broadcast, local log only)\n - Overcomers/TCians group: `120363230236878544@g.us` (church fellowship, PJ sends weekly attendance reports, routine attendance = local log only)\n\nExample search:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"6591444911-1614821287@g.us\"}}, \"limit\": 10, \"order\": \"desc\"}'\n```\n\n## Processing Image/Media Messages\n\nWhen an incoming WhatsApp message is an image (imageMessage), video, or other media with no text or only a caption:\n\n1. **Fetch conversation context** as usual with `wa_recent_messages.py`\n2. **Download the media** using the Evolution API:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"message\": {\"key\": {\"id\": \"MESSAGE_ID\", \"fromMe\": false, \"remoteJid\": \"GROUP_OR_CHAT_JID\"}}, \"convertToMp4\": false}' \\\n -o /tmp/wa_media_response.json\n ```\n Then decode: `python3 -c \"import json,base64; d=json.load(open('/tmp/wa_media_response.json')); open('/tmp/wa_media.jpg','wb').write(base64.b64decode(d['base64']))\"`\n3. **Analyze the image** using `vision_analyze` tool with `/tmp/wa_media.jpg` path and a context-aware question\n4. **Fallback if vision is unavailable**: Use `tesseract /tmp/wa_media.jpg stdout` for OCR, combined with pixel analysis (PIL/numpy) and conversation context to infer what the image contains\n5. **Album messages** (`albumMessage`): These bundle multiple images. Each image arrives as a separate `imageMessage` with a `messageAssociation.parentMessageKey` linking them to the album. Download each image separately. The `expectedImageCount` field tells how many images to expect\n6. **messageAssociation**: When images have `parentMessageKey` with `fromMe: true`, they are replying to a message Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an incoming WhatsApp message is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification when possible:\n\n1. **Fetch conversation context first**, especially in groups:\n ```bash\n python3 /opt/data/scripts/wa_recent_messages.py \"\" 50\n ```\n2. **Download the audio** with Evolution API, saving the response to a file:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_audio_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"GROUP_OR_CHAT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_audio_req.json -o /tmp/wa_audio_response.json\n python3 - <<'PY'\n import json, base64\n d=json.load(open('/tmp/wa_audio_response.json'))\n b=d.get('base64') or d.get('data',{}).get('base64')\n if ',' in b[:100]: b=b.split(',',1)[1]\n open('/tmp/wa_audio.ogg','wb').write(base64.b64decode(b))\n PY\n ```\n3. **Convert OGG/Opus to WAV**:\n ```bash\n ffmpeg -y -i /tmp/wa_audio.ogg -ar 16000 -ac 1 /tmp/wa_audio.wav\n ```\n4. **Transcribe fallback when Whisper is unavailable**: the container may not have `whisper`, `faster-whisper`, `torch`, or `speech_recognition` installed globally. A reliable lightweight fallback is a temp venv with SpeechRecognition and Google web speech:\n ```bash\n python3 -m venv /tmp/srvenv\n /tmp/srvenv/bin/pip install --quiet SpeechRecognition\n /tmp/srvenv/bin/python - <<'PY'\n import speech_recognition as sr\n r=sr.Recognizer()\n with sr.AudioFile('/tmp/wa_audio.wav') as source:\n audio=r.record(source)\n print(r.recognize_google(audio, language='en-SG'))\n PY\n ```\n5. **Check adjacent voice notes from the same sender**. WhatsApp voice notes are often sent in quick sequential chunks. Download and transcribe nearby `audioMessage` records from the same sender before deciding whether to notify Vincent.\n6. Use the transcript plus recent context to classify the message. For group meetup planning, extract what/when/where/who and notify Vincent if actionable, including transcript confidence.\n\n### Video Messages\n\nWhen an incoming WhatsApp message is `videoMessage` (with or without a caption):\n\n1. **Fetch conversation context** first, as with any media message: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n2. **Download the video** using Evolution API (same pattern as images/audio — save response to file, then decode):\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_video_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"CONTACT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_video_req.json -o /tmp/wa_video_response.json\n python3 - <<'PY'\n import json, base64\n d = json.load(open('/tmp/wa_video_response.json'))\n b = d.get('base64') or d.get('data', {}).get('base64', '')\n if ',' in b[:100]: b = b.split(',', 1)[1]\n open('/tmp/wa_video.mp4', 'wb').write(base64.b64decode(b))\n print(f\"Video saved: {len(base64.b64decode(b))} bytes\")\n PY\n ```\n3. **Extract representative frames** with ffmpeg (do NOT try to analyze video directly — vision tools work on images):\n ```bash\n # Frame at start\n ffmpeg -y -ss 0 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame0.jpg\n # Frame at ~1/3 of the way (good for short videos)\n ffmpeg -y -ss 3 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame3s.jpg\n # If video is longer (>15s), also extract at midpoint\n # ffprobe -v quiet -print_format json -show_format /tmp/wa_video.mp4 # get duration\n # ffmpeg -y -ss $HALF_DURATION -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame_mid.jpg\n ```\n4. **Analyze the frames** with `vision_analyze`. Ask specific questions about content: what objects/people/text are visible, what type of video this appears to be (screen recording, personal clip, forwarded meme, etc.), and any text or identifying details.\n5. **Use the caption** from `videoMessage.caption` if present (can be empty/null). Combine frame analysis + caption + conversation context to classify the message intent.\n6. **Transcription of audio in video** is generally not feasible with available tools. If the video likely contains important spoken content (e.g. voice memo style), note this limitation and flag it to Vincent.\n\n**Key points:**\n- Always use `-o` flag and file-based decode; videos can be 1-10MB and base64 responses will be even larger\n- `--max-time 25` on curl is essential for video downloads\n- Short casual clips (showing a key fob, a parking spot, quick view of surroundings) are usually casual/social — classify accordingly and don't over-escalate\n- If `vision_analyze` fails (400 on models that don't support images, or 429 rate limit), fall back to conversation context inference alone\n\n### Poll Messages\nWhen `pollUpdateMessage` appears in chat history (someone voted), the message contains a `pollCreationMessageKey` with the original poll's message `id`. To get the poll question and options, query:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"id\": \"POLL_MESSAGE_ID\", \"remoteJid\": \"GROUP_JID\"}}, \"limit\": 1}'\n```\nThe response will have `pollCreationMessageV3` with `name` (the poll question) and `options` array. Always look up the poll when votes or poll references appear, since the poll text contains the event details.\n\n### `associatedChildMessage` (Motion Photos / Companion Media)\nWhatsApp motion photos and companion media arrive as `associatedChildMessage` type. These are short videos or supplementary images attached to a parent message via `messageAssociation.parentMessageKey`. The structure is:\n\n```\nmessage: {\n messageContextInfo: {\n messageAssociation: {\n associationType: 12,\n parentMessageKey: { id, fromMe, remoteJid }\n }\n },\n associatedChildMessage: {\n message: {\n videoMessage: { ... } // or imageMessage: { ... }\n }\n }\n}\n```\n\nKey points:\n- The `parentMessageKey.id` links back to the main message (the image/video they accompany)\n- When `parentMessageKey.fromMe: true`, the companion media is replying to a message Vincent sent\n- Captions may appear on the child's `videoMessage.caption` or `imageMessage.caption` inside the nested message, NOT at the top level\n- These often contain motion-photo short videos (3 seconds) or album companion images\n- Treat them as media attachments to the parent; don't process them as standalone messages requiring independent intent analysis\n\n### Sequential Image Sets (Non-Album)\nMultiple images sent rapidly by the same sender at similar timestamps may NOT be wrapped in an `albumMessage`. They appear as separate `imageMessage` entries. Always check for adjacent images from the same sender and download+OCR all of them together, as they often form a coherent multi-page set (e.g., route info, comparison tables, maps). Missing one can lose important context.\n\n### Key pitfalls\n- The Evolution API `getBase64FromMediaMessage` endpoint returns the full base64 payload (can be 100KB+ for images). Use `-o` flag to save to file, then parse with Python. Do NOT try to pipe it directly\n- `curl` with `--max-time 25` is essential; without it, large media responses can hang\n- The `albumMessage` type itself has no `fromMe` field; check each individual `imageMessage` instead\n- Image thumbnails in the webhook payload are truncated; always download the full image for analysis\n- **WhatsApp media URLs expire**: CDN download links (`mmg.whatsapp.net`) expire after roughly 24-48 hours. When `getBase64FromMediaMessage` returns a 400/404 error like \"Failed to fetch stream from https://mmg.whatsapp.net/...\", the media is no longer retrievable via the API. This means you CANNOT download adjacent/older voice notes, images, or documents for context if they're more than a day or two old. Fall back to conversation text context only. Fresh/incoming messages are almost always still downloadable.\n- **Voice note transcription reliability**: Short voice notes (<15s), especially in Singlish/Chinese-English mix common in Singapore contacts, often produce unreliable transcriptions via SpeechRecognition+Google. Always flag low-confidence transcriptions to Vincent rather than acting on potentially garbled text.\n- **Vision API failures**: `vision_analyze` can fail with 429 (rate limit) **or 400** (model doesn't support image input, e.g. glm-5.1 via ollama-cloud). Always have a fallback plan. When vision is unavailable, try tesseract OCR, then pixel analysis (PIL/numpy for brightness zones, text region detection via row-level dark-pixel density)\n- **OCR can fail on decorative/complex images**: Obituaries, memorial notices, religious flyers with ornate fonts, or dark-background images often defeat tesseract even with aggressive preprocessing (upscaling, thresholding, inverting, PSM mode variations). When OCR returns gibberish or empty, fall back to conversation context inference: use preceding/surrounding messages to determine what the image likely contains, then draft an appropriate response based on that context\n- **OCR on devotional/dark-background images**: These often defeat tesseract on full-image scans. Use a multi-strategy approach: (1) run tesseract on the full image first, (2) if text is sparse, crop into regions (top 15%, middle, bottom 15%), upscale 2-3x, enhance contrast (2-3x), and try both normal and inverted (`ImageOps.invert`) versions with `--psm 6`, (3) combine all partial OCR results. For Exodus/verse images, the title (e.g. \"WHEN MOSES' HANDS WERE\") is often in decorative font that OCR misses, while the verse body text is usually readable.\n- **Prioritize conversation context**: When processing group images, always fetch recent context FIRST (before attempting vision/OCR). If image analysis fails, the conversation history often makes the image's purpose clear enough to draft a meaningful reply\n- **DM conversation context is one-sided (fromMe only)**: The Evolution API `findMessages` endpoint often returns only Vincent's sent messages (`fromMe: true`) for DM chats, not the contact's incoming messages. This means `wa_recent_messages.py` can show only one side of the conversation. When processing a DM webhook, do NOT rely solely on fetched history for the contact's side. The webhook payload already contains the incoming message text. Use what the webhook gives you plus Vincent's side from the API. If you need the contact's recent messages, check if `fromMe: false` records exist; if none are returned, the API simply doesn't have them and you must work with what you have (webhook message + Vincent's replies for context).\n- **`findMessages` ignores `key.fromMe` filter**: Passing `\"key.fromMe\": false` in the `where` clause still returns all messages (including `fromMe: true`). To find messages from a specific sender, fetch results and filter client-side by checking `fromMe` in Python.\n- **`findMessages` caps at ~50 records per request**: Even with `limit: 200` or `limit: 500`, the API returns at most 50 records. When one party sends many consecutive messages, the other party's messages may not appear in the first 50 results. Use pagination or fetch multiple pages if you need older messages.\n- **Video frame extraction**: Use `ffmpeg -y -ss N -i video.mp4 -frames:v 1 -update 1 output.jpg` (the `-update 1` flag is needed; without it ffmpeg treats the output as an image sequence and fails). Extract frames at multiple timestamps (start, ~1/3, midpoint for longer videos) since a single frame may miss the key content.\n- **ffmpeg \"Invalid color space\" failure on WhatsApp videos**: Some WhatsApp H.264 videos have `color_space=reserved` and `color_primaries=reserved` metadata that ffmpeg cannot decode at all. Every codec/format/filter combination fails with \"Invalid color space\" / return code 234. This is NOT fixable with `-vf setparams`, `-color_trc`, `-colorspace`, `-color_primaries`, or `-vf format=` flags because the error occurs at the decoder level before any filter runs. **Do not waste time trying ffmpeg workarounds.** When this happens, fall back to: (1) conversation context inference (who sent it, what thread it's in, adjacent messages), (2) the jpegThumbnail embedded in the videoMessage (small but gives a rough idea; base64-encoded JPEG, typically 600-900 bytes, use `vision_analyze` on it if available), or (3) if all analysis fails, just classify based on surrounding text context.\n- **`wa_recent_messages.py` JSON output breaks `json.loads()`**: The script's stdout contains WhatsApp message content with control characters (emoji, CJK, deviceListMetadata binary blobs) that cause `json.loads()` to fail with `Invalid control character` or `Expecting ',' delimiter` errors. Even `strict=False` does not reliably fix it. **Reliable approach**: use `curl -o /tmp/wa_msgs.json` to save the Evolution API response directly to a file (the file write handles binary safely), then parse with `json.load(open('/tmp/wa_msgs.json'))` in Python. Example:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 15 -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"CONTACT_JID\"}}, \"limit\": 20}' -o /tmp/wa_msgs.json\n ```\n Then in Python: `data = json.load(open('/tmp/wa_msgs.json')); records = data['messages']['records']`\n\n## API Reference\n\nFor full API endpoints (media, contacts, location, buttons) — see [references/api.md](references/api.md).\n\n[Skill directory: /opt/data/skills/personal/whatsapp-evolution]\nResolve any relative paths in this skill (e.g. `scripts/foo.js`, `templates/config.yaml`) against that directory, then run them with the terminal tool using the absolute path.\n\n[This skill has supporting files:]\n- references/api.md -> /opt/data/skills/personal/whatsapp-evolution/references/api.md\n- scripts/process_queue.py -> /opt/data/skills/personal/whatsapp-evolution/scripts/process_queue.py\n- scripts/send_text.sh -> /opt/data/skills/personal/whatsapp-evolution/scripts/send_text.sh\n\nLoad any of these with skill_view(name=\"personal/whatsapp-evolution\", file_path=\"\"), or run scripts directly by absolute path (e.g. `node /opt/data/skills/personal/whatsapp-evolution/scripts/foo.js`).\n\nThe user has provided the following instruction alongside the skill invocation: Incoming WhatsApp webhook received.\n\nEvent: messages.upsert\nInstance: n8n-WA\nSender: Hui Xian\nSender JID: 120363418226875471@g.us\nFrom me: False\nMessage type: reactionMessage\nMessage: {\n \"event\": \"messages.upsert\",\n \"instance\": \"n8n-WA\",\n \"data\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"3A05CB83ADB590CEE365\",\n \"participant\": \"176888278429759@lid\",\n \"participantAlt\": \"6596725405@s.whatsapp.net\",\n \"addressingMode\": \"lid\"\n },\n \"pushName\": \"Hui Xian\",\n \"status\": \"DELIVERY_ACK\",\n \"message\": {\n \"reactionMessage\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": true,\n \"id\": \"AC37643983DC88906B4D0F1A09F74082\",\n \"participant\": \"54644919296042@lid\"\n },\n \"text\": \"\\ud83d\\ude4f\\ud83c\\udffc\",\n \"senderTimestampMs\": {\n \"low\": -1593531232,\n \"high\": 414,\n \"unsigned\": false\n }\n }\n },\n \"messageType\": \"reactionMessage\",\n \"messageTimestamp\": 1780817896,\n \"instanceId\": \"b2bfaa30-d5f1-4b22-b979-791892619414\",\n \"source\": \"ios\"\n },\n \"destination\": \"https://jimmy.huper.asia/webhooks/whatsapp-incoming\",\n \"date_time\": \"2026-06-07T04:38:17.221Z\",\n \"sender\": \"6590054438@s.whatsapp.net\",\n \"server_url\": \"https://evolution.huper.asia\",\n \"apikey\": \"5E4C532223A0-460B-A749-16C584F56E74\"\n}\n\nPrimary goal: REDUCE NOISE. Stay silent in the main Telegram chat unless the message truly needs Vincent's immediate attention. Approval-worthy reply drafts should go to the jimmy-wa-drafts sidecar, NOT to the main Telegram chat.\n\nFollow the whatsapp-evolution skill and /opt/data/agent-state/memory/whatsapp-vip-contacts.json, with these stricter notification rules:\n\nABSOLUTE SILENCE RULES:\n- If fromMe is true: ignore. Do NOT call send_message. Final response local log only.\n- Ignore status/update/read-receipt events or payloads with no real message content.\n- Do NOT send main Telegram summaries for ordinary DMs, routine logistics, casual family messages, generic group events, group chatter, forwards, memes, devotional/religious shares, links, standalone codes, normal confirmations, or suggested replies.\n- Do NOT ask \"Want me to add this to Calendar?\" for WhatsApp group events unless Vincent is explicitly invited, named, responsible, or the event is time-sensitive and clearly relevant to him today.\n- Do NOT surface family messages just because they are family. Only surface if urgent, safety-related, school/pickup critical, money/legal/medical, or explicitly requires Vincent now.\n\nSIDEcar RULES:\n- If a reply may be useful but it is not urgent, POST it to https://jimmy-draft.huper.asia/drafts using the whatsapp-evolution Draft Approvals flow. Do NOT duplicate that draft/summary in the main Telegram chat.\n- Prayer and birthday replies should go to the sidecar when appropriate, not the main Telegram chat, unless the underlying situation is urgent/sensitive and Vincent needs immediate awareness.\n- Before drafting any reply or approval suggestion, fetch recent context: python3 /opt/data/scripts/wa_recent_messages.py \"120363418226875471@g.us\" 30. For active/ambiguous groups use 50.\n\nMAIN TELEGRAM ALERTS ARE ALLOWED ONLY FOR:\n1. VIP contacts marked always_call or always_notify when the message is important/actionable, or any VIP call event.\n2. Calls from family/VIP that appear missed/ringing and likely need immediate attention. Avoid repeated alerts for accepted/terminated duplicate call state updates.\n3. Emergencies, safety issues, medical issues, school pickup/child logistics requiring action now, legal/money/account/security matters, or time-sensitive commitments in the next few hours.\n4. Messages explicitly asking Vincent to call back, decide now, approve payment, provide OTP/code for an active task, or respond urgently.\n5. Pipeline errors or webhook processing failures.\n\nCALL RULES:\n- VIP contacts (e.g. Dn Jeff, Ng Whye Keong): call Vincent immediately outside quiet hours when actionably important or on ringing/missed call. During 1am-10am SGT, Telegram only unless life-and-death emergency.\n- Do not call for casual VIP confirmations or non-actionable courtesy messages.\n\nTONE/RULES FOR DRAFTS:\n- Proposed replies must be Christian and grace-based where appropriate: warm, humble, encouraging, non-judgmental, peace-making, not preachy.\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- NEVER auto-reply directly on WhatsApp unless Vincent explicitly approves or future config marks that exact category pre-approved.\n\nDelivery rule:\n- For allowed main alerts only: call send_message(target=\"telegram\", message=...). Keep it concise.\n- For non-urgent reply suggestions: use the sidecar only.\n- For everything else: DO NOT call send_message. Keep final response as local-only log.\n", "timestamp": 1780817934.9079058, "anchor": true}, {"id": 104153, "role": "assistant", "content": "**Local log only.** Hui Xian reacted 🙏🏽 to a message Vincent sent in group `120363418226875471@g.us`. Reaction messages have no conversational content and fall under the silence rule for status/read-receipt-type events. No notification, no draft, no action.", "timestamp": 1780817934.9237528}], "bookend_end": [], "messages_before": 0, "messages_after": 2}, {"session_id": "20260607_070528_af24596b", "when": "June 07, 2026 at 07:05 AM", "source": "webhook", "model": "glm-5.1", "title": "WhatsApp TikTok Link Share", "matched_role": "user", "match_message_id": 104149, "snippet": "...true`, they are replying to a >>>message<<< Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an >>>incoming<<< WhatsApp >>>message<<< is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification...", "bookend_start": [], "messages": [{"id": 104149, "role": "user", "content": "[IMPORTANT: The user has invoked the \"whatsapp-evolution\" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]\n\n---\nname: whatsapp-evolution\ndescription: WhatsApp messaging via Evolution API — send, receive, analyze, and auto-reply. Use when processing an incoming WhatsApp message delivered via the webhook, replying on Vincent's behalf, liaising with contacts, sending messages, or handling WhatsApp-related tasks. Covers intent analysis, auto-reply rules, escalation (VIP → phone call), conversation context fetching, and draft approvals via the jimmy-wa-drafts sidecar.\n---\n\n# WhatsApp via Evolution API\n\n## Credentials\nStored in `/opt/data/secrets/.whatsapp-evolution-credentials`:\n- `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE`\n\nLoad with: `export $(cat .whatsapp-evolution-credentials | xargs)`\n\n## Sending Messages (direct, no approval)\n\n```bash\nexport $(cat .whatsapp-evolution-credentials | xargs)\ncurl -s -X POST \"${EVOLUTION_API_URL}/message/sendText/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"number\": \"6591234567\", \"text\": \"Hello!\"}'\n```\n\n- Phone format: country code, no `+` (e.g. `6590054438`)\n- Groups: use JID `{groupid}@g.us`\n- For media/location/contacts — see [references/api.md](references/api.md)\n\nUse direct send only when Vincent has explicitly commanded/approved the exact send, or for pre-approved auto-replies/follow-ups. For any reply that needs Vincent's review, use the sidecar (next section).\n\n### Direct replies when Vincent names an existing WhatsApp thread\n\nIf Vincent says to “reply to X” or “confirm with X” and has clearly approved the action, but the contact JID/number is not in the current prompt:\n\n1. Search recent session history first (`session_search`) for the contact name, clinic/business name, or appointment details. Prior webhook summaries often contain the exact `remoteJid`, alternate phone JID, sender name, and context.\n2. If session history identifies the WhatsApp phone JID, send directly only if Vincent’s instruction is explicit enough to act. Keep the text concise and natural.\n3. If only a LID JID is found, look for `remoteJidAlt`/alternate phone JID in the session summary or fetch contacts/chats before sending.\n4. Do not invent phone numbers from web search when replying to an existing WhatsApp conversation. Use web search only as supporting context, not as the destination, unless Vincent explicitly asks to contact a public number.\n\n\n### Sending trip/camp details from retrieved emails\n\nWhen Vincent explicitly asks to send extracted school trip/camp details to a child or family member, send directly. Keep it practical and grounded:\n- Include dates, reporting/pick-up times, flights, venue/itinerary highlights, deadlines, forms, and safety/medication notes when found.\n- If an official packing list is missing, say that clearly. Do not invent one. You may add a short “based on itinerary/weather, prepare…” list, clearly labeled as practical guidance rather than official school instructions.\n- Avoid long explanations about the search process in WhatsApp. Send the useful details.\n\n### Sending a local image/file as media\n\nFor user-approved media sends, base64-encode the local file and call `sendMedia`:\n\n```bash\nexport $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\npython3 - <<'PY'\nimport base64, json\npath = '/tmp/file.png'\npayload = {\n 'number': '6591234567',\n 'mediatype': 'image',\n 'mimetype': 'image/png',\n 'caption': 'Caption here',\n 'media': base64.b64encode(open(path, 'rb').read()).decode(),\n 'fileName': 'file.png',\n}\nopen('/tmp/wa_media_payload.json', 'w').write(json.dumps(payload))\nPY\ncurl -s --max-time 60 -X POST \"${EVOLUTION_API_URL}/message/sendMedia/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_media_payload.json\n```\n\nWhen sending QR tickets or passes, verify the crop/image contains the full QR code and the intended attendee name before sending; the QR screenshot is effectively the redeemable ticket.\n\n## Fetching Conversation Context\n\nBefore drafting or proposing any reply, always fetch recent chat history for context:\n\n```bash\npython3 /opt/data/scripts/wa_recent_messages.py \"6591234567@s.whatsapp.net\" 30\n```\n\nUse `limit=30` by default. For active group chats or ambiguous situations, fetch up to 50 messages. Read enough context to understand who is being discussed, current tone, previous commitments, and whether Vincent has already replied. The script returns newest messages first (descending order).\n\n## Draft Approvals — `jimmy-wa-drafts` sidecar\n\nWhen you need Vincent to approve a WhatsApp reply, POST the draft to the sidecar. It owns the entire approval flow: Telegram message with 4 inline buttons, Telegram WebApp for Edit/Context, sending to WhatsApp via Evolution API on Approve.\n\n### Posting a draft\n\nAlways include an `extras` object with the original WA context. The sidecar passes `extras` through verbatim if Vincent later clicks \"Context\" — without it, your redraft will be flying blind.\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\n \"contact\": \"120363351787895353@g.us\",\n \"contact_name\": \"Cycling group\",\n \"summary\": \"Karen sent a blessing over the bike ride; she's not joining\",\n \"draft_text\": \"Thanks Karen, will miss you both — praying for a good time too 🙏\",\n \"extras\": {\n \"group_jid\": \"120363351787895353@g.us\",\n \"sender_name\": \"Karen Tan\",\n \"sender_jid\": \"6597630553@s.whatsapp.net\",\n \"category\": \"prayer/blessing\",\n \"original_message\": \"Happy cycling, to those who can make it… Dn Jeff and I had made plans for this PM so we won't be able to join y'all. Praying the Lord's favor, protection and good-weather blessings be upon y'all and your route!\",\n \"extracted\": {\n \"what\": \"Cycling from East Coast Park to Changi Village, then dinner\",\n \"when\": \"Today, 4:00 PM\",\n \"where\": \"East Coast Park to Changi Village\",\n \"who\": \"Group members who voted yes for cycling/dinner or dinner\"\n },\n \"confidence\": \"high\",\n \"why_appropriate\": \"Multiple group messages are blessings/wishes; Vincent has not replied; tone matches grace-based norm.\",\n \"wa_message_id\": \"ABC123XYZ\"\n }\n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\nResponse: `{\"draft_id\":\"abc123def\",\"telegram_message_id\":42}`\n\nWhat to put in `extras` (use whatever makes sense for the message type):\n\n- **Always**: `original_message` (verbatim WA text), `sender_name`, `category` (e.g. `prayer`, `birthday`, `event`, `dm-reply`, `enquiry`)\n- **Group messages**: `group_jid`, `sender_jid`\n- **DMs**: `sender_jid`\n- **Strongly recommended (drives the rich Telegram render)**: `extracted` (dict of who/what/when/where), `confidence` (`high`/`medium`/`low`), `why_appropriate` (one-line reasoning shown to Vincent so he can decide)\n- **Optional**: `wa_message_id`, `tone_hint` (e.g. `warm-grace-based`), `sender_phone` (auto-derived from `sender_jid` if omitted)\n\nVincent receives a Telegram message in the `@jimmy_WA_bot` chat with 4 inline buttons. The sidecar handles every interaction; **you do not track replies, callbacks, or message IDs**. Just POST and move on.\n\n| Button | Behavior |\n|--------|----------|\n| **Approve** | Sidecar sends draft to WhatsApp |\n| **Edit** | Opens a Telegram WebApp with draft pre-filled; Vincent edits + sends; sidecar pushes to WhatsApp |\n| **Context** | Opens a Telegram WebApp with empty textarea + the original WA message visible; Vincent provides context. (When `HERMES_REDRAFT_WEBHOOK` is wired in the sidecar's env, this will POST back to Jimmy for a redraft. Until then, the context is just stored on the draft.) |\n| **Pass** | Sidecar deletes the draft |\n\n### JSON escaping when `draft_text` contains quotes\n\nSingle quotes (`It's`, `God's`) break shell interpolation. Write the JSON to a temp file first:\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\"contact\":\"6591234567@s.whatsapp.net\",\"contact_name\":\"John\",\"summary\":\"…\",\"draft_text\":\"It's nice to hear from you!\"}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\n### Networking\n\n`https://jimmy-draft.huper.asia` is the public HTTPS endpoint of the sidecar (TLS-terminated by Traefik on ubuntu-server). HTTPS is required because Hermes blocks plain-HTTP commands via tirith. The internal Docker DNS hostname http://jimmy-wa-v64mqe:8080 also works for non-shell calls but the agent is forbidden from using it.\n\n### Sidecar status & logs\n\n- Source: (auto-deploys on push)\n- Public URL: (`/health` for ping)\n- Container: `jimmy-wa-v64mqe.1.` on ubuntu-server\n- Bot: `@jimmy_WA_bot`\n\n## Handling a Redraft Trigger (`wa-redraft` webhook)\n\nWhen Vincent clicks the **Context** button on a draft, the sidecar:\n1. Edits the original Telegram message to \"🔄 Adding context for X… agent is redrafting.\" (removes the old buttons so Vincent can't approve the now-stale text)\n2. POSTs to your `wa-redraft` webhook (`http://localhost:8644/webhooks/wa-redraft`) with the original draft fields plus Vincent's `context` instructions and the `extras` you stored when you first created the draft.\n\nWhen you receive a `wa-redraft` activation, the prompt template will lay out the fields. Your job:\n\n1. Read Vincent's `context` carefully — that's the steering signal. The `previous_draft` is just history, not necessarily a starting point.\n2. Re-fetch recent WA history if the original context isn't enough (`python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`). Use `extras.group_jid` for groups or `contact` for DMs.\n3. Write a NEW draft that addresses Vincent's instructions while staying faithful to the original WA context (in `extras.original_message`, `extras.extracted`).\n4. POST the new draft back to the sidecar **with the same `draft_id`** so the existing draft is updated, not duplicated:\n\n```bash\ncat > /tmp/wa_redraft.json <\",\n \"contact\": \"\",\n \"contact_name\": \"\",\n \"summary\": \"\",\n \"draft_text\": \"\",\n \"extras\": \n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_redraft.json\n```\n\nThe sidecar will send a NEW Telegram message with 4 fresh buttons. Vincent can Approve, Edit, Context-again (loop), or Pass.\n\nDo **not** send to WhatsApp directly during a redraft — the approval flow is the sidecar's job.\n\n## Incoming Message Processing\n\n### Delivery Pipeline (current)\n1. Evolution API POSTs each `MESSAGES_UPSERT` webhook directly to Hermes at `http://localhost:8644/webhooks/whatsapp-incoming` (proxied via `jimmy.huper.asia/webhooks/whatsapp-incoming`)\n2. Hermes webhook route (`whatsapp-incoming`, dynamic subscription) validates and spawns an agent\n3. Agent loads `whatsapp-evolution` skill, follows the rules below\n4. Webhook subscription uses `deliver: log` because webhooks do not support `deliver: local`; the agent explicitly calls `send_message(target=\"telegram\", ...)` only for important/actionable/VIP/event/birthday/prayer/error items, OR posts a draft to the sidecar for approval items.\n\n**No more OpenClaw, no Python webhook server, no port 5001, no keepalive cron.** If messages aren't being processed, check: nginx proxy to port 8644, `hermes webhook list` for the subscription, Evolution API webhook URL setting.\n\n## Contact Rules & Escalation\n\nConfig: `memory/whatsapp-vip-contacts.json`\n\n### Priority Levels\n\n| Priority | Action | Example |\n|----------|--------|---------|\n| **VIP (call-always)** | **CALL Vincent immediately via Twilio without asking.** Do NOT ask for approval first. During quiet hours (1am-10am SGT) suppress call and use Telegram notification only, unless life-and-death emergency. Match VIPs by both phone JID (`659...@s.whatsapp.net`) and LID JID (`...@lid`) when present, because recent chats may arrive with `key.remoteJid` as an LID and the phone number only in `key.remoteJidAlt`. | Dn Jeff (+6597630553) |\n| **Family** | Always notify Vincent on Telegram | ps112 group, family members |\n| **Auto-reply** | Reply based on context file, notify Vincent | Repair shops |\n| **Normal** | Analyze intent, reply if appropriate, notify if important | Everyone else |\n| **Broadcast/Spam** | Ignore silently | Forwarded job posts, chain messages |\n\n### Quiet Hours\n- 1am-10am SGT (17:00-02:00 UTC): avoid phone calls, including for Dn Jeff / VIP call-always contacts\n- **Exception:** life-and-death emergencies always call regardless of quiet hours\n- During quiet hours, use Telegram notification only and mention that a call was suppressed due to quiet hours\n\n## Making a Call via Twilio\n\nCredentials: `/opt/data/secrets/.twilio-credentials-new` (`TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`). From: `+16614809287`, Vincent: `+6590054438`.\n\nPreferred script: `/opt/data/scripts/call_vincent.py`, now using the Twilio Python SDK baked into the container.\n\n```bash\npython3 /opt/data/scripts/call_vincent.py \"Hello Vincent, this is Jimmy. Urgent summary here.\"\n```\n\n## Action Rules\n\n### Default: DO NOT auto-reply directly, and do not spam main Telegram\nCollect intent and stay local unless something truly needs Vincent's immediate attention.\n\n**When you need Vincent to approve a reply, POST a draft to the sidecar (see \"Draft Approvals\" above).** The sidecar handles the Telegram notification with buttons, the WebApp for editing, and sending to WhatsApp on approve. Do **not** duplicate that draft or summary in the main Telegram chat.\n\nMain Telegram WhatsApp summaries are muted by default. Only use `send_message(target=\"telegram\", ...)` for urgent/important/actionable items that need Vincent now: VIP actionable messages/calls, missed/ringing family calls, emergencies, safety/medical/school-pickup issues, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors. Routine logistics, casual family pings, generic group events, links, standalone codes, normal confirmations, and suggested replies should be local-only or sidecar-only.\n\nDo NOT send WhatsApp replies directly unless Vincent explicitly approves them or the category is pre-approved.\n\n### Tone (when proposing or sending replies)\n- Casual, direct, no sign-offs (no \"Thanks & Regards\", no \"- Vincent\")\n- Ask questions instead of giving instructions\n- Keep it natural — like texting a friend\n- Christian and grace-based: warm, humble, encouraging, non-judgmental, peace-making\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- For prayer/birthday contexts, use gentle faith-filled language; avoid sounding preachy, performative, or overly formal\n\n### What to Do With Each Message\n\n**DMs (personal chats):**\n- Analyze intent\n- Notify Vincent on Telegram (via `send_message(target=\"telegram\", ...)`) only if important/actionable/time-sensitive, requires a decision, comes from a VIP contact\n- If you have a useful suggested reply, POST it as a draft to the sidecar instead of a Telegram notification — Vincent gets the buttons inline\n- Casual/social/non-actionable DMs → local log only, no Telegram notification, no draft\n- Do NOT reply unless Vincent says to\n- VIP contacts → **call Vincent immediately without asking**. Do NOT post a draft. Just call. During quiet hours (1am-10am SGT), suppress the call and send Telegram notification only, unless life-and-death emergency.\n\n**Group chats — events/meetups:**\n- If someone posts about an event, gathering, meetup, dinner, etc:\n 1. Extract: what, when, where, who\n 2. Notify Vincent on Telegram with details (no draft needed unless you also want to suggest an RSVP reply)\n 3. Ask: \"Want me to add this to Google Calendar and set a reminder?\"\n\n**Group chats — prayer requests / praying for others:**\n- If someone asks for prayer, shares a burden, illness, grief, struggle, thanksgiving, or asks the group to pray:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Identify who/what to pray for and any sensitive details\n 3. Draft a short, grace-based Christian reply/prayer\n 4. POST it to the sidecar with `summary` describing the request and `draft_text` being the prayer\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — birthday greetings:**\n- If you see birthday wishes for someone in a group:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Note who the birthday person is and relationship/context if inferable\n 3. Draft a warm, grace-based Christian birthday greeting\n 4. POST it to the sidecar\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — general:**\n- Casual banter, memes, forwards, broadcast spam → ignore silently\n- Religious content shares → ignore unless specifically addressed to Vincent\n\n### When to Notify Vincent (Telegram) vs Post a Draft\n\n**Use `send_message(target=\"telegram\", ...)` sparingly** for plain notifications only when the message needs Vincent's immediate attention: VIP actionable DMs/calls, emergencies, school pickup/child logistics requiring action now, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors.\n\n**Use the sidecar (POST `/drafts`)** whenever you have a suggested reply for Vincent to approve: prayer responses, birthday greetings, DM replies, group RSVPs. Do not also send the same summary to the main Telegram chat.\n\nDo **not** notify or draft for casual/social/non-actionable DMs, routine logistics, read receipts/status updates, religious shares not addressed to Vincent, memes, forwarded/broadcast/spam, generic group events, standalone codes, normal confirmations, or general group chatter. Family messages are not automatically main-chat alerts unless urgent, safety-related, school/pickup critical, or explicitly requiring Vincent now.\n\n## Intent Categories\n\nWhen analyzing incoming messages, classify as:\n\n1. **Urgent** — time-sensitive, emergencies → call Vincent outside quiet hours; during quiet hours use Telegram only unless life-and-death\n2. **Important** — decisions, money, appointments, anything consequential → call Vincent outside quiet hours; during quiet hours notify on Telegram only unless life-threatening\n3. **Request** — asking Vincent to do something → **notify Vincent** (Telegram) or post a draft if you have a safe reply\n4. **Enquiry** — asking for info, pricing, availability → auto-reply if context available, otherwise post a draft\n5. **Confirmation** — confirming plans, bookings → acknowledge and notify Vincent\n6. **Casual/Social** — greetings, memes, forwards → usually ignore or react\n7. **Broadcast/Spam** — forwarded job posts, chain messages → ignore\n\n## Finding Individual Contacts and JIDs\n\nWhen Vincent asks to monitor or prioritize a specific WhatsApp contact and only gives a name:\n\n1. Query contacts and chats, then search both display names and serialized message metadata:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n python3 - <<'PY'\n import os, requests, json\n base=os.environ['EVOLUTION_API_URL']; inst=os.environ['EVOLUTION_INSTANCE']; key=os.environ['EVOLUTION_API_KEY']\n headers={'apikey':key,'Content-Type':'application/json'}\n terms=['whye','keong'] # replace with lowercase search terms\n for ep in ['findContacts','findChats']:\n r=requests.post(f'{base}/chat/{ep}/{inst}', headers=headers, json={}, timeout=30)\n print('endpoint', ep, 'status', r.status_code)\n for obj in (r.json() if r.ok else []):\n if any(t in json.dumps(obj, ensure_ascii=False).lower() for t in terms):\n print(json.dumps(obj, ensure_ascii=False)[:1000])\n PY\n ```\n2. If `findChats` shows `remoteJid` ending in `@lid`, inspect `lastMessage.key.remoteJidAlt` for the phone JID. Store both in `memory/whatsapp-vip-contacts.json` when configuring VIP/call rules.\n3. Verify the number exists with `/chat/whatsappNumbers/{instance}` before finalizing.\n4. Validate the JSON config after editing: `python3 -m json.tool /opt/data/agent-state/memory/whatsapp-vip-contacts.json >/tmp/wa_vip_validated.json`.\n\n## Finding Group JIDs\n\nThe `group/fetchGroups` endpoint returns 404 on some Evolution API instances (including ours). To find a group JID when you don't have it:\n\n1. **Use `chat/findMessages` with an empty `where` filter** to pull recent messages across all chats, then extract unique `remoteJid` values containing `@g.us`.\n2. **Match by content**: Fetch messages from each group JID using `where.key.remoteJid` and match sender names or message text to identify the target group.\n3. **Known group JIDs** (store in `whatsapp-vip-contacts.json` as you discover them):\n - ps112 family group: `6591887488-1464061170@g.us`\n - Overcomers group: `6591444911-1614821287@g.us`\n - 神的华美 group: `6596756578-1405873840@g.us` (content-sharing, casual/broadcast, local log only)\n - Overcomers/TCians group: `120363230236878544@g.us` (church fellowship, PJ sends weekly attendance reports, routine attendance = local log only)\n\nExample search:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"6591444911-1614821287@g.us\"}}, \"limit\": 10, \"order\": \"desc\"}'\n```\n\n## Processing Image/Media Messages\n\nWhen an incoming WhatsApp message is an image (imageMessage), video, or other media with no text or only a caption:\n\n1. **Fetch conversation context** as usual with `wa_recent_messages.py`\n2. **Download the media** using the Evolution API:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"message\": {\"key\": {\"id\": \"MESSAGE_ID\", \"fromMe\": false, \"remoteJid\": \"GROUP_OR_CHAT_JID\"}}, \"convertToMp4\": false}' \\\n -o /tmp/wa_media_response.json\n ```\n Then decode: `python3 -c \"import json,base64; d=json.load(open('/tmp/wa_media_response.json')); open('/tmp/wa_media.jpg','wb').write(base64.b64decode(d['base64']))\"`\n3. **Analyze the image** using `vision_analyze` tool with `/tmp/wa_media.jpg` path and a context-aware question\n4. **Fallback if vision is unavailable**: Use `tesseract /tmp/wa_media.jpg stdout` for OCR, combined with pixel analysis (PIL/numpy) and conversation context to infer what the image contains\n5. **Album messages** (`albumMessage`): These bundle multiple images. Each image arrives as a separate `imageMessage` with a `messageAssociation.parentMessageKey` linking them to the album. Download each image separately. The `expectedImageCount` field tells how many images to expect\n6. **messageAssociation**: When images have `parentMessageKey` with `fromMe: true`, they are replying to a message Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an incoming WhatsApp message is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification when possible:\n\n1. **Fetch conversation context first**, especially in groups:\n ```bash\n python3 /opt/data/scripts/wa_recent_messages.py \"\" 50\n ```\n2. **Download the audio** with Evolution API, saving the response to a file:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_audio_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"GROUP_OR_CHAT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_audio_req.json -o /tmp/wa_audio_response.json\n python3 - <<'PY'\n import json, base64\n d=json.load(open('/tmp/wa_audio_response.json'))\n b=d.get('base64') or d.get('data',{}).get('base64')\n if ',' in b[:100]: b=b.split(',',1)[1]\n open('/tmp/wa_audio.ogg','wb').write(base64.b64decode(b))\n PY\n ```\n3. **Convert OGG/Opus to WAV**:\n ```bash\n ffmpeg -y -i /tmp/wa_audio.ogg -ar 16000 -ac 1 /tmp/wa_audio.wav\n ```\n4. **Transcribe fallback when Whisper is unavailable**: the container may not have `whisper`, `faster-whisper`, `torch`, or `speech_recognition` installed globally. A reliable lightweight fallback is a temp venv with SpeechRecognition and Google web speech:\n ```bash\n python3 -m venv /tmp/srvenv\n /tmp/srvenv/bin/pip install --quiet SpeechRecognition\n /tmp/srvenv/bin/python - <<'PY'\n import speech_recognition as sr\n r=sr.Recognizer()\n with sr.AudioFile('/tmp/wa_audio.wav') as source:\n audio=r.record(source)\n print(r.recognize_google(audio, language='en-SG'))\n PY\n ```\n5. **Check adjacent voice notes from the same sender**. WhatsApp voice notes are often sent in quick sequential chunks. Download and transcribe nearby `audioMessage` records from the same sender before deciding whether to notify Vincent.\n6. Use the transcript plus recent context to classify the message. For group meetup planning, extract what/when/where/who and notify Vincent if actionable, including transcript confidence.\n\n### Video Messages\n\nWhen an incoming WhatsApp message is `videoMessage` (with or without a caption):\n\n1. **Fetch conversation context** first, as with any media message: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n2. **Download the video** using Evolution API (same pattern as images/audio — save response to file, then decode):\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_video_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"CONTACT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_video_req.json -o /tmp/wa_video_response.json\n python3 - <<'PY'\n import json, base64\n d = json.load(open('/tmp/wa_video_response.json'))\n b = d.get('base64') or d.get('data', {}).get('base64', '')\n if ',' in b[:100]: b = b.split(',', 1)[1]\n open('/tmp/wa_video.mp4', 'wb').write(base64.b64decode(b))\n print(f\"Video saved: {len(base64.b64decode(b))} bytes\")\n PY\n ```\n3. **Extract representative frames** with ffmpeg (do NOT try to analyze video directly — vision tools work on images):\n ```bash\n # Frame at start\n ffmpeg -y -ss 0 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame0.jpg\n # Frame at ~1/3 of the way (good for short videos)\n ffmpeg -y -ss 3 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame3s.jpg\n # If video is longer (>15s), also extract at midpoint\n # ffprobe -v quiet -print_format json -show_format /tmp/wa_video.mp4 # get duration\n # ffmpeg -y -ss $HALF_DURATION -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame_mid.jpg\n ```\n4. **Analyze the frames** with `vision_analyze`. Ask specific questions about content: what objects/people/text are visible, what type of video this appears to be (screen recording, personal clip, forwarded meme, etc.), and any text or identifying details.\n5. **Use the caption** from `videoMessage.caption` if present (can be empty/null). Combine frame analysis + caption + conversation context to classify the message intent.\n6. **Transcription of audio in video** is generally not feasible with available tools. If the video likely contains important spoken content (e.g. voice memo style), note this limitation and flag it to Vincent.\n\n**Key points:**\n- Always use `-o` flag and file-based decode; videos can be 1-10MB and base64 responses will be even larger\n- `--max-time 25` on curl is essential for video downloads\n- Short casual clips (showing a key fob, a parking spot, quick view of surroundings) are usually casual/social — classify accordingly and don't over-escalate\n- If `vision_analyze` fails (400 on models that don't support images, or 429 rate limit), fall back to conversation context inference alone\n\n### Poll Messages\nWhen `pollUpdateMessage` appears in chat history (someone voted), the message contains a `pollCreationMessageKey` with the original poll's message `id`. To get the poll question and options, query:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"id\": \"POLL_MESSAGE_ID\", \"remoteJid\": \"GROUP_JID\"}}, \"limit\": 1}'\n```\nThe response will have `pollCreationMessageV3` with `name` (the poll question) and `options` array. Always look up the poll when votes or poll references appear, since the poll text contains the event details.\n\n### `associatedChildMessage` (Motion Photos / Companion Media)\nWhatsApp motion photos and companion media arrive as `associatedChildMessage` type. These are short videos or supplementary images attached to a parent message via `messageAssociation.parentMessageKey`. The structure is:\n\n```\nmessage: {\n messageContextInfo: {\n messageAssociation: {\n associationType: 12,\n parentMessageKey: { id, fromMe, remoteJid }\n }\n },\n associatedChildMessage: {\n message: {\n videoMessage: { ... } // or imageMessage: { ... }\n }\n }\n}\n```\n\nKey points:\n- The `parentMessageKey.id` links back to the main message (the image/video they accompany)\n- When `parentMessageKey.fromMe: true`, the companion media is replying to a message Vincent sent\n- Captions may appear on the child's `videoMessage.caption` or `imageMessage.caption` inside the nested message, NOT at the top level\n- These often contain motion-photo short videos (3 seconds) or album companion images\n- Treat them as media attachments to the parent; don't process them as standalone messages requiring independent intent analysis\n\n### Sequential Image Sets (Non-Album)\nMultiple images sent rapidly by the same sender at similar timestamps may NOT be wrapped in an `albumMessage`. They appear as separate `imageMessage` entries. Always check for adjacent images from the same sender and download+OCR all of them together, as they often form a coherent multi-page set (e.g., route info, comparison tables, maps). Missing one can lose important context.\n\n### Key pitfalls\n- The Evolution API `getBase64FromMediaMessage` endpoint returns the full base64 payload (can be 100KB+ for images). Use `-o` flag to save to file, then parse with Python. Do NOT try to pipe it directly\n- `curl` with `--max-time 25` is essential; without it, large media responses can hang\n- The `albumMessage` type itself has no `fromMe` field; check each individual `imageMessage` instead\n- Image thumbnails in the webhook payload are truncated; always download the full image for analysis\n- **WhatsApp media URLs expire**: CDN download links (`mmg.whatsapp.net`) expire after roughly 24-48 hours. When `getBase64FromMediaMessage` returns a 400/404 error like \"Failed to fetch stream from https://mmg.whatsapp.net/...\", the media is no longer retrievable via the API. This means you CANNOT download adjacent/older voice notes, images, or documents for context if they're more than a day or two old. Fall back to conversation text context only. Fresh/incoming messages are almost always still downloadable.\n- **Voice note transcription reliability**: Short voice notes (<15s), especially in Singlish/Chinese-English mix common in Singapore contacts, often produce unreliable transcriptions via SpeechRecognition+Google. Always flag low-confidence transcriptions to Vincent rather than acting on potentially garbled text.\n- **Vision API failures**: `vision_analyze` can fail with 429 (rate limit) **or 400** (model doesn't support image input, e.g. glm-5.1 via ollama-cloud). Always have a fallback plan. When vision is unavailable, try tesseract OCR, then pixel analysis (PIL/numpy for brightness zones, text region detection via row-level dark-pixel density)\n- **OCR can fail on decorative/complex images**: Obituaries, memorial notices, religious flyers with ornate fonts, or dark-background images often defeat tesseract even with aggressive preprocessing (upscaling, thresholding, inverting, PSM mode variations). When OCR returns gibberish or empty, fall back to conversation context inference: use preceding/surrounding messages to determine what the image likely contains, then draft an appropriate response based on that context\n- **OCR on devotional/dark-background images**: These often defeat tesseract on full-image scans. Use a multi-strategy approach: (1) run tesseract on the full image first, (2) if text is sparse, crop into regions (top 15%, middle, bottom 15%), upscale 2-3x, enhance contrast (2-3x), and try both normal and inverted (`ImageOps.invert`) versions with `--psm 6`, (3) combine all partial OCR results. For Exodus/verse images, the title (e.g. \"WHEN MOSES' HANDS WERE\") is often in decorative font that OCR misses, while the verse body text is usually readable.\n- **Prioritize conversation context**: When processing group images, always fetch recent context FIRST (before attempting vision/OCR). If image analysis fails, the conversation history often makes the image's purpose clear enough to draft a meaningful reply\n- **DM conversation context is one-sided (fromMe only)**: The Evolution API `findMessages` endpoint often returns only Vincent's sent messages (`fromMe: true`) for DM chats, not the contact's incoming messages. This means `wa_recent_messages.py` can show only one side of the conversation. When processing a DM webhook, do NOT rely solely on fetched history for the contact's side. The webhook payload already contains the incoming message text. Use what the webhook gives you plus Vincent's side from the API. If you need the contact's recent messages, check if `fromMe: false` records exist; if none are returned, the API simply doesn't have them and you must work with what you have (webhook message + Vincent's replies for context).\n- **`findMessages` ignores `key.fromMe` filter**: Passing `\"key.fromMe\": false` in the `where` clause still returns all messages (including `fromMe: true`). To find messages from a specific sender, fetch results and filter client-side by checking `fromMe` in Python.\n- **`findMessages` caps at ~50 records per request**: Even with `limit: 200` or `limit: 500`, the API returns at most 50 records. When one party sends many consecutive messages, the other party's messages may not appear in the first 50 results. Use pagination or fetch multiple pages if you need older messages.\n- **Video frame extraction**: Use `ffmpeg -y -ss N -i video.mp4 -frames:v 1 -update 1 output.jpg` (the `-update 1` flag is needed; without it ffmpeg treats the output as an image sequence and fails). Extract frames at multiple timestamps (start, ~1/3, midpoint for longer videos) since a single frame may miss the key content.\n- **ffmpeg \"Invalid color space\" failure on WhatsApp videos**: Some WhatsApp H.264 videos have `color_space=reserved` and `color_primaries=reserved` metadata that ffmpeg cannot decode at all. Every codec/format/filter combination fails with \"Invalid color space\" / return code 234. This is NOT fixable with `-vf setparams`, `-color_trc`, `-colorspace`, `-color_primaries`, or `-vf format=` flags because the error occurs at the decoder level before any filter runs. **Do not waste time trying ffmpeg workarounds.** When this happens, fall back to: (1) conversation context inference (who sent it, what thread it's in, adjacent messages), (2) the jpegThumbnail embedded in the videoMessage (small but gives a rough idea; base64-encoded JPEG, typically 600-900 bytes, use `vision_analyze` on it if available), or (3) if all analysis fails, just classify based on surrounding text context.\n- **`wa_recent_messages.py` JSON output breaks `json.loads()`**: The script's stdout contains WhatsApp message content with control characters (emoji, CJK, deviceListMetadata binary blobs) that cause `json.loads()` to fail with `Invalid control character` or `Expecting ',' delimiter` errors. Even `strict=False` does not reliably fix it. **Reliable approach**: use `curl -o /tmp/wa_msgs.json` to save the Evolution API response directly to a file (the file write handles binary safely), then parse with `json.load(open('/tmp/wa_msgs.json'))` in Python. Example:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 15 -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"CONTACT_JID\"}}, \"limit\": 20}' -o /tmp/wa_msgs.json\n ```\n Then in Python: `data = json.load(open('/tmp/wa_msgs.json')); records = data['messages']['records']`\n\n## API Reference\n\nFor full API endpoints (media, contacts, location, buttons) — see [references/api.md](references/api.md).\n\n[Skill directory: /opt/data/skills/personal/whatsapp-evolution]\nResolve any relative paths in this skill (e.g. `scripts/foo.js`, `templates/config.yaml`) against that directory, then run them with the terminal tool using the absolute path.\n\n[This skill has supporting files:]\n- references/api.md -> /opt/data/skills/personal/whatsapp-evolution/references/api.md\n- scripts/process_queue.py -> /opt/data/skills/personal/whatsapp-evolution/scripts/process_queue.py\n- scripts/send_text.sh -> /opt/data/skills/personal/whatsapp-evolution/scripts/send_text.sh\n\nLoad any of these with skill_view(name=\"personal/whatsapp-evolution\", file_path=\"\"), or run scripts directly by absolute path (e.g. `node /opt/data/skills/personal/whatsapp-evolution/scripts/foo.js`).\n\nThe user has provided the following instruction alongside the skill invocation: Incoming WhatsApp webhook received.\n\nEvent: messages.upsert\nInstance: n8n-WA\nSender: 神的华美\nSender JID: 6591069539@s.whatsapp.net\nFrom me: False\nMessage type: conversation\nMessage: {\n \"event\": \"messages.upsert\",\n \"instance\": \"n8n-WA\",\n \"data\": {\n \"key\": {\n \"remoteJid\": \"6591069539@s.whatsapp.net\",\n \"remoteJidAlt\": \"6591069539@s.whatsapp.net\",\n \"fromMe\": false,\n \"id\": \"AC97947F19E0EBD08AD5A776C5B26A58\",\n \"participant\": \"\",\n \"addressingMode\": \"lid\"\n },\n \"pushName\": \"\\u795e\\u7684\\u534e\\u7f8e\",\n \"status\": \"DELIVERY_ACK\",\n \"message\": {\n \"messageContextInfo\": {\n \"threadId\": [],\n \"deviceListMetadata\": {\n \"senderKeyIndexes\": [],\n \"recipientKeyIndexes\": [],\n \"recipientKeyHash\": {\n \"0\": 178,\n \"1\": 98,\n \"2\": 210,\n \"3\": 215,\n \"4\": 214,\n \"5\": 115,\n \"6\": 141,\n \"7\": 49,\n \"8\": 137,\n \"9\": 97\n },\n \"recipientTimestamp\": {\n \"low\": 1779868286,\n \"high\": 0,\n \"unsigned\": true\n }\n },\n \"deviceListMetadataVersion\": 2,\n \"messageSecret\": {\n \"0\": 248,\n \"1\": 159,\n \"2\": 95,\n \"3\": 11,\n \"4\": 218,\n \"5\": 122,\n \"6\": 28,\n \"7\": 48,\n \"8\": 158,\n \"9\": 248,\n \"10\": 66,\n \"11\": 212,\n \"12\": 57,\n \"13\": 110,\n \"14\": 29,\n \"15\": 66,\n \"16\": 25,\n \"17\": 101,\n \"18\": 23,\n \"19\": 114,\n \"20\": 162,\n \"21\": 26,\n \"22\": 224,\n \"23\": 136,\n \"24\": 217,\n \"25\": 11,\n \"26\": 112,\n \"27\": 199,\n \"28\": 246,\n \"29\": 144,\n \"30\": 186,\n \"31\": 103\n }\n },\n \"conversation\": \"https://vt.tiktok.com/ZSQ2ktTph/\"\n },\n \"contextInfo\": null,\n \"messageType\": \"conversation\",\n \"messageTimestamp\": 1780815928,\n \"instanceId\": \"b2bfaa30-d5f1-4b22-b979-791892619414\",\n \"source\": \"android\"\n },\n \"destination\": \"https://jimmy.huper.asia/webhooks/whatsapp-incoming\",\n \"date_time\": \"2026-06-07T04:05:28.853Z\",\n \"sender\": \"6590054438@s.whatsapp.net\",\n \"server_url\": \"https://evolution.huper.asia\",\n \"apikey\": \"5E4C532223A0-460B-A749-16C584F56E74\"\n}\n\nPrimary goal: REDUCE NOISE. Stay silent in the main Telegram chat unless the message truly needs Vincent's immediate attention. Approval-worthy reply drafts should go to the jimmy-wa-drafts sidecar, NOT to the main Telegram chat.\n\nFollow the whatsapp-evolution skill and /opt/data/agent-state/memory/whatsapp-vip-contacts.json, with these stricter notification rules:\n\nABSOLUTE SILENCE RULES:\n- If fromMe is true: ignore. Do NOT call send_message. Final response local log only.\n- Ignore status/update/read-receipt events or payloads with no real message content.\n- Do NOT send main Telegram summaries for ordinary DMs, routine logistics, casual family messages, generic group events, group chatter, forwards, memes, devotional/religious shares, links, standalone codes, normal confirmations, or suggested replies.\n- Do NOT ask \"Want me to add this to Calendar?\" for WhatsApp group events unless Vincent is explicitly invited, named, responsible, or the event is time-sensitive and clearly relevant to him today.\n- Do NOT surface family messages just because they are family. Only surface if urgent, safety-related, school/pickup critical, money/legal/medical, or explicitly requires Vincent now.\n\nSIDEcar RULES:\n- If a reply may be useful but it is not urgent, POST it to https://jimmy-draft.huper.asia/drafts using the whatsapp-evolution Draft Approvals flow. Do NOT duplicate that draft/summary in the main Telegram chat.\n- Prayer and birthday replies should go to the sidecar when appropriate, not the main Telegram chat, unless the underlying situation is urgent/sensitive and Vincent needs immediate awareness.\n- Before drafting any reply or approval suggestion, fetch recent context: python3 /opt/data/scripts/wa_recent_messages.py \"6591069539@s.whatsapp.net\" 30. For active/ambiguous groups use 50.\n\nMAIN TELEGRAM ALERTS ARE ALLOWED ONLY FOR:\n1. VIP contacts marked always_call or always_notify when the message is important/actionable, or any VIP call event.\n2. Calls from family/VIP that appear missed/ringing and likely need immediate attention. Avoid repeated alerts for accepted/terminated duplicate call state updates.\n3. Emergencies, safety issues, medical issues, school pickup/child logistics requiring action now, legal/money/account/security matters, or time-sensitive commitments in the next few hours.\n4. Messages explicitly asking Vincent to call back, decide now, approve payment, provide OTP/code for an active task, or respond urgently.\n5. Pipeline errors or webhook processing failures.\n\nCALL RULES:\n- VIP contacts (e.g. Dn Jeff, Ng Whye Keong): call Vincent immediately outside quiet hours when actionably important or on ringing/missed call. During 1am-10am SGT, Telegram only unless life-and-death emergency.\n- Do not call for casual VIP confirmations or non-actionable courtesy messages.\n\nTONE/RULES FOR DRAFTS:\n- Proposed replies must be Christian and grace-based where appropriate: warm, humble, encouraging, non-judgmental, peace-making, not preachy.\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- NEVER auto-reply directly on WhatsApp unless Vincent explicitly approves or future config marks that exact category pre-approved.\n\nDelivery rule:\n- For allowed main alerts only: call send_message(target=\"telegram\", message=...). Keep it concise.\n- For non-urgent reply suggestions: use the sidecar only.\n- For everything else: DO NOT call send_message. Keep final response as local-only log.\n", "timestamp": 1780815937.0646276, "anchor": true}, {"id": 104150, "role": "assistant", "content": "**Incoming WhatsApp DM — classified as casual/social link share**\n\n| Field | Detail |\n|---|---|\n| **Contact** | 神的华美 |\n| **JID** | 6591069539@s.whatsapp.net |\n| **Message** | `https://vt.tiktok.com/ZSQ2ktTph/` |\n| **Type** | TikTok link, no text, no question |\n| **Intent** | Casual/social — forwarded link |\n| **Action** | Local log only |\n\nNo Telegram notification sent. A bare TikTok link with no accompanying question or context is casual/social content — not urgent, not actionable, and not something Vincent needs to be alerted about. No reply drafted.", "timestamp": 1780815937.1456609}], "bookend_end": [], "messages_before": 0, "messages_after": 2}, {"session_id": "20260607_070357_b6d52f43", "when": "June 07, 2026 at 07:03 AM", "source": "webhook", "model": "glm-5.1", "title": null, "matched_role": "user", "match_message_id": 104146, "snippet": "...true`, they are replying to a >>>message<<< Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an >>>incoming<<< WhatsApp >>>message<<< is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification...", "bookend_start": [], "messages": [{"id": 104146, "role": "user", "content": "[IMPORTANT: The user has invoked the \"whatsapp-evolution\" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]\n\n---\nname: whatsapp-evolution\ndescription: WhatsApp messaging via Evolution API — send, receive, analyze, and auto-reply. Use when processing an incoming WhatsApp message delivered via the webhook, replying on Vincent's behalf, liaising with contacts, sending messages, or handling WhatsApp-related tasks. Covers intent analysis, auto-reply rules, escalation (VIP → phone call), conversation context fetching, and draft approvals via the jimmy-wa-drafts sidecar.\n---\n\n# WhatsApp via Evolution API\n\n## Credentials\nStored in `/opt/data/secrets/.whatsapp-evolution-credentials`:\n- `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE`\n\nLoad with: `export $(cat .whatsapp-evolution-credentials | xargs)`\n\n## Sending Messages (direct, no approval)\n\n```bash\nexport $(cat .whatsapp-evolution-credentials | xargs)\ncurl -s -X POST \"${EVOLUTION_API_URL}/message/sendText/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"number\": \"6591234567\", \"text\": \"Hello!\"}'\n```\n\n- Phone format: country code, no `+` (e.g. `6590054438`)\n- Groups: use JID `{groupid}@g.us`\n- For media/location/contacts — see [references/api.md](references/api.md)\n\nUse direct send only when Vincent has explicitly commanded/approved the exact send, or for pre-approved auto-replies/follow-ups. For any reply that needs Vincent's review, use the sidecar (next section).\n\n### Direct replies when Vincent names an existing WhatsApp thread\n\nIf Vincent says to “reply to X” or “confirm with X” and has clearly approved the action, but the contact JID/number is not in the current prompt:\n\n1. Search recent session history first (`session_search`) for the contact name, clinic/business name, or appointment details. Prior webhook summaries often contain the exact `remoteJid`, alternate phone JID, sender name, and context.\n2. If session history identifies the WhatsApp phone JID, send directly only if Vincent’s instruction is explicit enough to act. Keep the text concise and natural.\n3. If only a LID JID is found, look for `remoteJidAlt`/alternate phone JID in the session summary or fetch contacts/chats before sending.\n4. Do not invent phone numbers from web search when replying to an existing WhatsApp conversation. Use web search only as supporting context, not as the destination, unless Vincent explicitly asks to contact a public number.\n\n\n### Sending trip/camp details from retrieved emails\n\nWhen Vincent explicitly asks to send extracted school trip/camp details to a child or family member, send directly. Keep it practical and grounded:\n- Include dates, reporting/pick-up times, flights, venue/itinerary highlights, deadlines, forms, and safety/medication notes when found.\n- If an official packing list is missing, say that clearly. Do not invent one. You may add a short “based on itinerary/weather, prepare…” list, clearly labeled as practical guidance rather than official school instructions.\n- Avoid long explanations about the search process in WhatsApp. Send the useful details.\n\n### Sending a local image/file as media\n\nFor user-approved media sends, base64-encode the local file and call `sendMedia`:\n\n```bash\nexport $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\npython3 - <<'PY'\nimport base64, json\npath = '/tmp/file.png'\npayload = {\n 'number': '6591234567',\n 'mediatype': 'image',\n 'mimetype': 'image/png',\n 'caption': 'Caption here',\n 'media': base64.b64encode(open(path, 'rb').read()).decode(),\n 'fileName': 'file.png',\n}\nopen('/tmp/wa_media_payload.json', 'w').write(json.dumps(payload))\nPY\ncurl -s --max-time 60 -X POST \"${EVOLUTION_API_URL}/message/sendMedia/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_media_payload.json\n```\n\nWhen sending QR tickets or passes, verify the crop/image contains the full QR code and the intended attendee name before sending; the QR screenshot is effectively the redeemable ticket.\n\n## Fetching Conversation Context\n\nBefore drafting or proposing any reply, always fetch recent chat history for context:\n\n```bash\npython3 /opt/data/scripts/wa_recent_messages.py \"6591234567@s.whatsapp.net\" 30\n```\n\nUse `limit=30` by default. For active group chats or ambiguous situations, fetch up to 50 messages. Read enough context to understand who is being discussed, current tone, previous commitments, and whether Vincent has already replied. The script returns newest messages first (descending order).\n\n## Draft Approvals — `jimmy-wa-drafts` sidecar\n\nWhen you need Vincent to approve a WhatsApp reply, POST the draft to the sidecar. It owns the entire approval flow: Telegram message with 4 inline buttons, Telegram WebApp for Edit/Context, sending to WhatsApp via Evolution API on Approve.\n\n### Posting a draft\n\nAlways include an `extras` object with the original WA context. The sidecar passes `extras` through verbatim if Vincent later clicks \"Context\" — without it, your redraft will be flying blind.\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\n \"contact\": \"120363351787895353@g.us\",\n \"contact_name\": \"Cycling group\",\n \"summary\": \"Karen sent a blessing over the bike ride; she's not joining\",\n \"draft_text\": \"Thanks Karen, will miss you both — praying for a good time too 🙏\",\n \"extras\": {\n \"group_jid\": \"120363351787895353@g.us\",\n \"sender_name\": \"Karen Tan\",\n \"sender_jid\": \"6597630553@s.whatsapp.net\",\n \"category\": \"prayer/blessing\",\n \"original_message\": \"Happy cycling, to those who can make it… Dn Jeff and I had made plans for this PM so we won't be able to join y'all. Praying the Lord's favor, protection and good-weather blessings be upon y'all and your route!\",\n \"extracted\": {\n \"what\": \"Cycling from East Coast Park to Changi Village, then dinner\",\n \"when\": \"Today, 4:00 PM\",\n \"where\": \"East Coast Park to Changi Village\",\n \"who\": \"Group members who voted yes for cycling/dinner or dinner\"\n },\n \"confidence\": \"high\",\n \"why_appropriate\": \"Multiple group messages are blessings/wishes; Vincent has not replied; tone matches grace-based norm.\",\n \"wa_message_id\": \"ABC123XYZ\"\n }\n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\nResponse: `{\"draft_id\":\"abc123def\",\"telegram_message_id\":42}`\n\nWhat to put in `extras` (use whatever makes sense for the message type):\n\n- **Always**: `original_message` (verbatim WA text), `sender_name`, `category` (e.g. `prayer`, `birthday`, `event`, `dm-reply`, `enquiry`)\n- **Group messages**: `group_jid`, `sender_jid`\n- **DMs**: `sender_jid`\n- **Strongly recommended (drives the rich Telegram render)**: `extracted` (dict of who/what/when/where), `confidence` (`high`/`medium`/`low`), `why_appropriate` (one-line reasoning shown to Vincent so he can decide)\n- **Optional**: `wa_message_id`, `tone_hint` (e.g. `warm-grace-based`), `sender_phone` (auto-derived from `sender_jid` if omitted)\n\nVincent receives a Telegram message in the `@jimmy_WA_bot` chat with 4 inline buttons. The sidecar handles every interaction; **you do not track replies, callbacks, or message IDs**. Just POST and move on.\n\n| Button | Behavior |\n|--------|----------|\n| **Approve** | Sidecar sends draft to WhatsApp |\n| **Edit** | Opens a Telegram WebApp with draft pre-filled; Vincent edits + sends; sidecar pushes to WhatsApp |\n| **Context** | Opens a Telegram WebApp with empty textarea + the original WA message visible; Vincent provides context. (When `HERMES_REDRAFT_WEBHOOK` is wired in the sidecar's env, this will POST back to Jimmy for a redraft. Until then, the context is just stored on the draft.) |\n| **Pass** | Sidecar deletes the draft |\n\n### JSON escaping when `draft_text` contains quotes\n\nSingle quotes (`It's`, `God's`) break shell interpolation. Write the JSON to a temp file first:\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\"contact\":\"6591234567@s.whatsapp.net\",\"contact_name\":\"John\",\"summary\":\"…\",\"draft_text\":\"It's nice to hear from you!\"}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\n### Networking\n\n`https://jimmy-draft.huper.asia` is the public HTTPS endpoint of the sidecar (TLS-terminated by Traefik on ubuntu-server). HTTPS is required because Hermes blocks plain-HTTP commands via tirith. The internal Docker DNS hostname http://jimmy-wa-v64mqe:8080 also works for non-shell calls but the agent is forbidden from using it.\n\n### Sidecar status & logs\n\n- Source: (auto-deploys on push)\n- Public URL: (`/health` for ping)\n- Container: `jimmy-wa-v64mqe.1.` on ubuntu-server\n- Bot: `@jimmy_WA_bot`\n\n## Handling a Redraft Trigger (`wa-redraft` webhook)\n\nWhen Vincent clicks the **Context** button on a draft, the sidecar:\n1. Edits the original Telegram message to \"🔄 Adding context for X… agent is redrafting.\" (removes the old buttons so Vincent can't approve the now-stale text)\n2. POSTs to your `wa-redraft` webhook (`http://localhost:8644/webhooks/wa-redraft`) with the original draft fields plus Vincent's `context` instructions and the `extras` you stored when you first created the draft.\n\nWhen you receive a `wa-redraft` activation, the prompt template will lay out the fields. Your job:\n\n1. Read Vincent's `context` carefully — that's the steering signal. The `previous_draft` is just history, not necessarily a starting point.\n2. Re-fetch recent WA history if the original context isn't enough (`python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`). Use `extras.group_jid` for groups or `contact` for DMs.\n3. Write a NEW draft that addresses Vincent's instructions while staying faithful to the original WA context (in `extras.original_message`, `extras.extracted`).\n4. POST the new draft back to the sidecar **with the same `draft_id`** so the existing draft is updated, not duplicated:\n\n```bash\ncat > /tmp/wa_redraft.json <\",\n \"contact\": \"\",\n \"contact_name\": \"\",\n \"summary\": \"\",\n \"draft_text\": \"\",\n \"extras\": \n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_redraft.json\n```\n\nThe sidecar will send a NEW Telegram message with 4 fresh buttons. Vincent can Approve, Edit, Context-again (loop), or Pass.\n\nDo **not** send to WhatsApp directly during a redraft — the approval flow is the sidecar's job.\n\n## Incoming Message Processing\n\n### Delivery Pipeline (current)\n1. Evolution API POSTs each `MESSAGES_UPSERT` webhook directly to Hermes at `http://localhost:8644/webhooks/whatsapp-incoming` (proxied via `jimmy.huper.asia/webhooks/whatsapp-incoming`)\n2. Hermes webhook route (`whatsapp-incoming`, dynamic subscription) validates and spawns an agent\n3. Agent loads `whatsapp-evolution` skill, follows the rules below\n4. Webhook subscription uses `deliver: log` because webhooks do not support `deliver: local`; the agent explicitly calls `send_message(target=\"telegram\", ...)` only for important/actionable/VIP/event/birthday/prayer/error items, OR posts a draft to the sidecar for approval items.\n\n**No more OpenClaw, no Python webhook server, no port 5001, no keepalive cron.** If messages aren't being processed, check: nginx proxy to port 8644, `hermes webhook list` for the subscription, Evolution API webhook URL setting.\n\n## Contact Rules & Escalation\n\nConfig: `memory/whatsapp-vip-contacts.json`\n\n### Priority Levels\n\n| Priority | Action | Example |\n|----------|--------|---------|\n| **VIP (call-always)** | **CALL Vincent immediately via Twilio without asking.** Do NOT ask for approval first. During quiet hours (1am-10am SGT) suppress call and use Telegram notification only, unless life-and-death emergency. Match VIPs by both phone JID (`659...@s.whatsapp.net`) and LID JID (`...@lid`) when present, because recent chats may arrive with `key.remoteJid` as an LID and the phone number only in `key.remoteJidAlt`. | Dn Jeff (+6597630553) |\n| **Family** | Always notify Vincent on Telegram | ps112 group, family members |\n| **Auto-reply** | Reply based on context file, notify Vincent | Repair shops |\n| **Normal** | Analyze intent, reply if appropriate, notify if important | Everyone else |\n| **Broadcast/Spam** | Ignore silently | Forwarded job posts, chain messages |\n\n### Quiet Hours\n- 1am-10am SGT (17:00-02:00 UTC): avoid phone calls, including for Dn Jeff / VIP call-always contacts\n- **Exception:** life-and-death emergencies always call regardless of quiet hours\n- During quiet hours, use Telegram notification only and mention that a call was suppressed due to quiet hours\n\n## Making a Call via Twilio\n\nCredentials: `/opt/data/secrets/.twilio-credentials-new` (`TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`). From: `+16614809287`, Vincent: `+6590054438`.\n\nPreferred script: `/opt/data/scripts/call_vincent.py`, now using the Twilio Python SDK baked into the container.\n\n```bash\npython3 /opt/data/scripts/call_vincent.py \"Hello Vincent, this is Jimmy. Urgent summary here.\"\n```\n\n## Action Rules\n\n### Default: DO NOT auto-reply directly, and do not spam main Telegram\nCollect intent and stay local unless something truly needs Vincent's immediate attention.\n\n**When you need Vincent to approve a reply, POST a draft to the sidecar (see \"Draft Approvals\" above).** The sidecar handles the Telegram notification with buttons, the WebApp for editing, and sending to WhatsApp on approve. Do **not** duplicate that draft or summary in the main Telegram chat.\n\nMain Telegram WhatsApp summaries are muted by default. Only use `send_message(target=\"telegram\", ...)` for urgent/important/actionable items that need Vincent now: VIP actionable messages/calls, missed/ringing family calls, emergencies, safety/medical/school-pickup issues, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors. Routine logistics, casual family pings, generic group events, links, standalone codes, normal confirmations, and suggested replies should be local-only or sidecar-only.\n\nDo NOT send WhatsApp replies directly unless Vincent explicitly approves them or the category is pre-approved.\n\n### Tone (when proposing or sending replies)\n- Casual, direct, no sign-offs (no \"Thanks & Regards\", no \"- Vincent\")\n- Ask questions instead of giving instructions\n- Keep it natural — like texting a friend\n- Christian and grace-based: warm, humble, encouraging, non-judgmental, peace-making\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- For prayer/birthday contexts, use gentle faith-filled language; avoid sounding preachy, performative, or overly formal\n\n### What to Do With Each Message\n\n**DMs (personal chats):**\n- Analyze intent\n- Notify Vincent on Telegram (via `send_message(target=\"telegram\", ...)`) only if important/actionable/time-sensitive, requires a decision, comes from a VIP contact\n- If you have a useful suggested reply, POST it as a draft to the sidecar instead of a Telegram notification — Vincent gets the buttons inline\n- Casual/social/non-actionable DMs → local log only, no Telegram notification, no draft\n- Do NOT reply unless Vincent says to\n- VIP contacts → **call Vincent immediately without asking**. Do NOT post a draft. Just call. During quiet hours (1am-10am SGT), suppress the call and send Telegram notification only, unless life-and-death emergency.\n\n**Group chats — events/meetups:**\n- If someone posts about an event, gathering, meetup, dinner, etc:\n 1. Extract: what, when, where, who\n 2. Notify Vincent on Telegram with details (no draft needed unless you also want to suggest an RSVP reply)\n 3. Ask: \"Want me to add this to Google Calendar and set a reminder?\"\n\n**Group chats — prayer requests / praying for others:**\n- If someone asks for prayer, shares a burden, illness, grief, struggle, thanksgiving, or asks the group to pray:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Identify who/what to pray for and any sensitive details\n 3. Draft a short, grace-based Christian reply/prayer\n 4. POST it to the sidecar with `summary` describing the request and `draft_text` being the prayer\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — birthday greetings:**\n- If you see birthday wishes for someone in a group:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Note who the birthday person is and relationship/context if inferable\n 3. Draft a warm, grace-based Christian birthday greeting\n 4. POST it to the sidecar\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — general:**\n- Casual banter, memes, forwards, broadcast spam → ignore silently\n- Religious content shares → ignore unless specifically addressed to Vincent\n\n### When to Notify Vincent (Telegram) vs Post a Draft\n\n**Use `send_message(target=\"telegram\", ...)` sparingly** for plain notifications only when the message needs Vincent's immediate attention: VIP actionable DMs/calls, emergencies, school pickup/child logistics requiring action now, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors.\n\n**Use the sidecar (POST `/drafts`)** whenever you have a suggested reply for Vincent to approve: prayer responses, birthday greetings, DM replies, group RSVPs. Do not also send the same summary to the main Telegram chat.\n\nDo **not** notify or draft for casual/social/non-actionable DMs, routine logistics, read receipts/status updates, religious shares not addressed to Vincent, memes, forwarded/broadcast/spam, generic group events, standalone codes, normal confirmations, or general group chatter. Family messages are not automatically main-chat alerts unless urgent, safety-related, school/pickup critical, or explicitly requiring Vincent now.\n\n## Intent Categories\n\nWhen analyzing incoming messages, classify as:\n\n1. **Urgent** — time-sensitive, emergencies → call Vincent outside quiet hours; during quiet hours use Telegram only unless life-and-death\n2. **Important** — decisions, money, appointments, anything consequential → call Vincent outside quiet hours; during quiet hours notify on Telegram only unless life-threatening\n3. **Request** — asking Vincent to do something → **notify Vincent** (Telegram) or post a draft if you have a safe reply\n4. **Enquiry** — asking for info, pricing, availability → auto-reply if context available, otherwise post a draft\n5. **Confirmation** — confirming plans, bookings → acknowledge and notify Vincent\n6. **Casual/Social** — greetings, memes, forwards → usually ignore or react\n7. **Broadcast/Spam** — forwarded job posts, chain messages → ignore\n\n## Finding Individual Contacts and JIDs\n\nWhen Vincent asks to monitor or prioritize a specific WhatsApp contact and only gives a name:\n\n1. Query contacts and chats, then search both display names and serialized message metadata:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n python3 - <<'PY'\n import os, requests, json\n base=os.environ['EVOLUTION_API_URL']; inst=os.environ['EVOLUTION_INSTANCE']; key=os.environ['EVOLUTION_API_KEY']\n headers={'apikey':key,'Content-Type':'application/json'}\n terms=['whye','keong'] # replace with lowercase search terms\n for ep in ['findContacts','findChats']:\n r=requests.post(f'{base}/chat/{ep}/{inst}', headers=headers, json={}, timeout=30)\n print('endpoint', ep, 'status', r.status_code)\n for obj in (r.json() if r.ok else []):\n if any(t in json.dumps(obj, ensure_ascii=False).lower() for t in terms):\n print(json.dumps(obj, ensure_ascii=False)[:1000])\n PY\n ```\n2. If `findChats` shows `remoteJid` ending in `@lid`, inspect `lastMessage.key.remoteJidAlt` for the phone JID. Store both in `memory/whatsapp-vip-contacts.json` when configuring VIP/call rules.\n3. Verify the number exists with `/chat/whatsappNumbers/{instance}` before finalizing.\n4. Validate the JSON config after editing: `python3 -m json.tool /opt/data/agent-state/memory/whatsapp-vip-contacts.json >/tmp/wa_vip_validated.json`.\n\n## Finding Group JIDs\n\nThe `group/fetchGroups` endpoint returns 404 on some Evolution API instances (including ours). To find a group JID when you don't have it:\n\n1. **Use `chat/findMessages` with an empty `where` filter** to pull recent messages across all chats, then extract unique `remoteJid` values containing `@g.us`.\n2. **Match by content**: Fetch messages from each group JID using `where.key.remoteJid` and match sender names or message text to identify the target group.\n3. **Known group JIDs** (store in `whatsapp-vip-contacts.json` as you discover them):\n - ps112 family group: `6591887488-1464061170@g.us`\n - Overcomers group: `6591444911-1614821287@g.us`\n - 神的华美 group: `6596756578-1405873840@g.us` (content-sharing, casual/broadcast, local log only)\n - Overcomers/TCians group: `120363230236878544@g.us` (church fellowship, PJ sends weekly attendance reports, routine attendance = local log only)\n\nExample search:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"6591444911-1614821287@g.us\"}}, \"limit\": 10, \"order\": \"desc\"}'\n```\n\n## Processing Image/Media Messages\n\nWhen an incoming WhatsApp message is an image (imageMessage), video, or other media with no text or only a caption:\n\n1. **Fetch conversation context** as usual with `wa_recent_messages.py`\n2. **Download the media** using the Evolution API:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"message\": {\"key\": {\"id\": \"MESSAGE_ID\", \"fromMe\": false, \"remoteJid\": \"GROUP_OR_CHAT_JID\"}}, \"convertToMp4\": false}' \\\n -o /tmp/wa_media_response.json\n ```\n Then decode: `python3 -c \"import json,base64; d=json.load(open('/tmp/wa_media_response.json')); open('/tmp/wa_media.jpg','wb').write(base64.b64decode(d['base64']))\"`\n3. **Analyze the image** using `vision_analyze` tool with `/tmp/wa_media.jpg` path and a context-aware question\n4. **Fallback if vision is unavailable**: Use `tesseract /tmp/wa_media.jpg stdout` for OCR, combined with pixel analysis (PIL/numpy) and conversation context to infer what the image contains\n5. **Album messages** (`albumMessage`): These bundle multiple images. Each image arrives as a separate `imageMessage` with a `messageAssociation.parentMessageKey` linking them to the album. Download each image separately. The `expectedImageCount` field tells how many images to expect\n6. **messageAssociation**: When images have `parentMessageKey` with `fromMe: true`, they are replying to a message Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an incoming WhatsApp message is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification when possible:\n\n1. **Fetch conversation context first**, especially in groups:\n ```bash\n python3 /opt/data/scripts/wa_recent_messages.py \"\" 50\n ```\n2. **Download the audio** with Evolution API, saving the response to a file:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_audio_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"GROUP_OR_CHAT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_audio_req.json -o /tmp/wa_audio_response.json\n python3 - <<'PY'\n import json, base64\n d=json.load(open('/tmp/wa_audio_response.json'))\n b=d.get('base64') or d.get('data',{}).get('base64')\n if ',' in b[:100]: b=b.split(',',1)[1]\n open('/tmp/wa_audio.ogg','wb').write(base64.b64decode(b))\n PY\n ```\n3. **Convert OGG/Opus to WAV**:\n ```bash\n ffmpeg -y -i /tmp/wa_audio.ogg -ar 16000 -ac 1 /tmp/wa_audio.wav\n ```\n4. **Transcribe fallback when Whisper is unavailable**: the container may not have `whisper`, `faster-whisper`, `torch`, or `speech_recognition` installed globally. A reliable lightweight fallback is a temp venv with SpeechRecognition and Google web speech:\n ```bash\n python3 -m venv /tmp/srvenv\n /tmp/srvenv/bin/pip install --quiet SpeechRecognition\n /tmp/srvenv/bin/python - <<'PY'\n import speech_recognition as sr\n r=sr.Recognizer()\n with sr.AudioFile('/tmp/wa_audio.wav') as source:\n audio=r.record(source)\n print(r.recognize_google(audio, language='en-SG'))\n PY\n ```\n5. **Check adjacent voice notes from the same sender**. WhatsApp voice notes are often sent in quick sequential chunks. Download and transcribe nearby `audioMessage` records from the same sender before deciding whether to notify Vincent.\n6. Use the transcript plus recent context to classify the message. For group meetup planning, extract what/when/where/who and notify Vincent if actionable, including transcript confidence.\n\n### Video Messages\n\nWhen an incoming WhatsApp message is `videoMessage` (with or without a caption):\n\n1. **Fetch conversation context** first, as with any media message: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n2. **Download the video** using Evolution API (same pattern as images/audio — save response to file, then decode):\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_video_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"CONTACT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_video_req.json -o /tmp/wa_video_response.json\n python3 - <<'PY'\n import json, base64\n d = json.load(open('/tmp/wa_video_response.json'))\n b = d.get('base64') or d.get('data', {}).get('base64', '')\n if ',' in b[:100]: b = b.split(',', 1)[1]\n open('/tmp/wa_video.mp4', 'wb').write(base64.b64decode(b))\n print(f\"Video saved: {len(base64.b64decode(b))} bytes\")\n PY\n ```\n3. **Extract representative frames** with ffmpeg (do NOT try to analyze video directly — vision tools work on images):\n ```bash\n # Frame at start\n ffmpeg -y -ss 0 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame0.jpg\n # Frame at ~1/3 of the way (good for short videos)\n ffmpeg -y -ss 3 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame3s.jpg\n # If video is longer (>15s), also extract at midpoint\n # ffprobe -v quiet -print_format json -show_format /tmp/wa_video.mp4 # get duration\n # ffmpeg -y -ss $HALF_DURATION -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame_mid.jpg\n ```\n4. **Analyze the frames** with `vision_analyze`. Ask specific questions about content: what objects/people/text are visible, what type of video this appears to be (screen recording, personal clip, forwarded meme, etc.), and any text or identifying details.\n5. **Use the caption** from `videoMessage.caption` if present (can be empty/null). Combine frame analysis + caption + conversation context to classify the message intent.\n6. **Transcription of audio in video** is generally not feasible with available tools. If the video likely contains important spoken content (e.g. voice memo style), note this limitation and flag it to Vincent.\n\n**Key points:**\n- Always use `-o` flag and file-based decode; videos can be 1-10MB and base64 responses will be even larger\n- `--max-time 25` on curl is essential for video downloads\n- Short casual clips (showing a key fob, a parking spot, quick view of surroundings) are usually casual/social — classify accordingly and don't over-escalate\n- If `vision_analyze` fails (400 on models that don't support images, or 429 rate limit), fall back to conversation context inference alone\n\n### Poll Messages\nWhen `pollUpdateMessage` appears in chat history (someone voted), the message contains a `pollCreationMessageKey` with the original poll's message `id`. To get the poll question and options, query:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"id\": \"POLL_MESSAGE_ID\", \"remoteJid\": \"GROUP_JID\"}}, \"limit\": 1}'\n```\nThe response will have `pollCreationMessageV3` with `name` (the poll question) and `options` array. Always look up the poll when votes or poll references appear, since the poll text contains the event details.\n\n### `associatedChildMessage` (Motion Photos / Companion Media)\nWhatsApp motion photos and companion media arrive as `associatedChildMessage` type. These are short videos or supplementary images attached to a parent message via `messageAssociation.parentMessageKey`. The structure is:\n\n```\nmessage: {\n messageContextInfo: {\n messageAssociation: {\n associationType: 12,\n parentMessageKey: { id, fromMe, remoteJid }\n }\n },\n associatedChildMessage: {\n message: {\n videoMessage: { ... } // or imageMessage: { ... }\n }\n }\n}\n```\n\nKey points:\n- The `parentMessageKey.id` links back to the main message (the image/video they accompany)\n- When `parentMessageKey.fromMe: true`, the companion media is replying to a message Vincent sent\n- Captions may appear on the child's `videoMessage.caption` or `imageMessage.caption` inside the nested message, NOT at the top level\n- These often contain motion-photo short videos (3 seconds) or album companion images\n- Treat them as media attachments to the parent; don't process them as standalone messages requiring independent intent analysis\n\n### Sequential Image Sets (Non-Album)\nMultiple images sent rapidly by the same sender at similar timestamps may NOT be wrapped in an `albumMessage`. They appear as separate `imageMessage` entries. Always check for adjacent images from the same sender and download+OCR all of them together, as they often form a coherent multi-page set (e.g., route info, comparison tables, maps). Missing one can lose important context.\n\n### Key pitfalls\n- The Evolution API `getBase64FromMediaMessage` endpoint returns the full base64 payload (can be 100KB+ for images). Use `-o` flag to save to file, then parse with Python. Do NOT try to pipe it directly\n- `curl` with `--max-time 25` is essential; without it, large media responses can hang\n- The `albumMessage` type itself has no `fromMe` field; check each individual `imageMessage` instead\n- Image thumbnails in the webhook payload are truncated; always download the full image for analysis\n- **WhatsApp media URLs expire**: CDN download links (`mmg.whatsapp.net`) expire after roughly 24-48 hours. When `getBase64FromMediaMessage` returns a 400/404 error like \"Failed to fetch stream from https://mmg.whatsapp.net/...\", the media is no longer retrievable via the API. This means you CANNOT download adjacent/older voice notes, images, or documents for context if they're more than a day or two old. Fall back to conversation text context only. Fresh/incoming messages are almost always still downloadable.\n- **Voice note transcription reliability**: Short voice notes (<15s), especially in Singlish/Chinese-English mix common in Singapore contacts, often produce unreliable transcriptions via SpeechRecognition+Google. Always flag low-confidence transcriptions to Vincent rather than acting on potentially garbled text.\n- **Vision API failures**: `vision_analyze` can fail with 429 (rate limit) **or 400** (model doesn't support image input, e.g. glm-5.1 via ollama-cloud). Always have a fallback plan. When vision is unavailable, try tesseract OCR, then pixel analysis (PIL/numpy for brightness zones, text region detection via row-level dark-pixel density)\n- **OCR can fail on decorative/complex images**: Obituaries, memorial notices, religious flyers with ornate fonts, or dark-background images often defeat tesseract even with aggressive preprocessing (upscaling, thresholding, inverting, PSM mode variations). When OCR returns gibberish or empty, fall back to conversation context inference: use preceding/surrounding messages to determine what the image likely contains, then draft an appropriate response based on that context\n- **OCR on devotional/dark-background images**: These often defeat tesseract on full-image scans. Use a multi-strategy approach: (1) run tesseract on the full image first, (2) if text is sparse, crop into regions (top 15%, middle, bottom 15%), upscale 2-3x, enhance contrast (2-3x), and try both normal and inverted (`ImageOps.invert`) versions with `--psm 6`, (3) combine all partial OCR results. For Exodus/verse images, the title (e.g. \"WHEN MOSES' HANDS WERE\") is often in decorative font that OCR misses, while the verse body text is usually readable.\n- **Prioritize conversation context**: When processing group images, always fetch recent context FIRST (before attempting vision/OCR). If image analysis fails, the conversation history often makes the image's purpose clear enough to draft a meaningful reply\n- **DM conversation context is one-sided (fromMe only)**: The Evolution API `findMessages` endpoint often returns only Vincent's sent messages (`fromMe: true`) for DM chats, not the contact's incoming messages. This means `wa_recent_messages.py` can show only one side of the conversation. When processing a DM webhook, do NOT rely solely on fetched history for the contact's side. The webhook payload already contains the incoming message text. Use what the webhook gives you plus Vincent's side from the API. If you need the contact's recent messages, check if `fromMe: false` records exist; if none are returned, the API simply doesn't have them and you must work with what you have (webhook message + Vincent's replies for context).\n- **`findMessages` ignores `key.fromMe` filter**: Passing `\"key.fromMe\": false` in the `where` clause still returns all messages (including `fromMe: true`). To find messages from a specific sender, fetch results and filter client-side by checking `fromMe` in Python.\n- **`findMessages` caps at ~50 records per request**: Even with `limit: 200` or `limit: 500`, the API returns at most 50 records. When one party sends many consecutive messages, the other party's messages may not appear in the first 50 results. Use pagination or fetch multiple pages if you need older messages.\n- **Video frame extraction**: Use `ffmpeg -y -ss N -i video.mp4 -frames:v 1 -update 1 output.jpg` (the `-update 1` flag is needed; without it ffmpeg treats the output as an image sequence and fails). Extract frames at multiple timestamps (start, ~1/3, midpoint for longer videos) since a single frame may miss the key content.\n- **ffmpeg \"Invalid color space\" failure on WhatsApp videos**: Some WhatsApp H.264 videos have `color_space=reserved` and `color_primaries=reserved` metadata that ffmpeg cannot decode at all. Every codec/format/filter combination fails with \"Invalid color space\" / return code 234. This is NOT fixable with `-vf setparams`, `-color_trc`, `-colorspace`, `-color_primaries`, or `-vf format=` flags because the error occurs at the decoder level before any filter runs. **Do not waste time trying ffmpeg workarounds.** When this happens, fall back to: (1) conversation context inference (who sent it, what thread it's in, adjacent messages), (2) the jpegThumbnail embedded in the videoMessage (small but gives a rough idea; base64-encoded JPEG, typically 600-900 bytes, use `vision_analyze` on it if available), or (3) if all analysis fails, just classify based on surrounding text context.\n- **`wa_recent_messages.py` JSON output breaks `json.loads()`**: The script's stdout contains WhatsApp message content with control characters (emoji, CJK, deviceListMetadata binary blobs) that cause `json.loads()` to fail with `Invalid control character` or `Expecting ',' delimiter` errors. Even `strict=False` does not reliably fix it. **Reliable approach**: use `curl -o /tmp/wa_msgs.json` to save the Evolution API response directly to a file (the file write handles binary safely), then parse with `json.load(open('/tmp/wa_msgs.json'))` in Python. Example:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 15 -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"CONTACT_JID\"}}, \"limit\": 20}' -o /tmp/wa_msgs.json\n ```\n Then in Python: `data = json.load(open('/tmp/wa_msgs.json')); records = data['messages']['records']`\n\n## API Reference\n\nFor full API endpoints (media, contacts, location, buttons) — see [references/api.md](references/api.md).\n\n[Skill directory: /opt/data/skills/personal/whatsapp-evolution]\nResolve any relative paths in this skill (e.g. `scripts/foo.js`, `templates/config.yaml`) against that directory, then run them with the terminal tool using the absolute path.\n\n[This skill has supporting files:]\n- references/api.md -> /opt/data/skills/personal/whatsapp-evolution/references/api.md\n- scripts/process_queue.py -> /opt/data/skills/personal/whatsapp-evolution/scripts/process_queue.py\n- scripts/send_text.sh -> /opt/data/skills/personal/whatsapp-evolution/scripts/send_text.sh\n\nLoad any of these with skill_view(name=\"personal/whatsapp-evolution\", file_path=\"\"), or run scripts directly by absolute path (e.g. `node /opt/data/skills/personal/whatsapp-evolution/scripts/foo.js`).\n\nThe user has provided the following instruction alongside the skill invocation: Incoming WhatsApp webhook received.\n\nEvent: messages.upsert\nInstance: n8n-WA\nSender: Charissa\nSender JID: 120363418226875471@g.us\nFrom me: False\nMessage type: reactionMessage\nMessage: {\n \"event\": \"messages.upsert\",\n \"instance\": \"n8n-WA\",\n \"data\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"AC1107DAF9ECFD6C8250093817509B73\",\n \"participant\": \"174448720224439@lid\",\n \"participantAlt\": \"6592208540@s.whatsapp.net\",\n \"addressingMode\": \"lid\"\n },\n \"pushName\": \"Charissa\",\n \"status\": \"DELIVERY_ACK\",\n \"message\": {\n \"senderKeyDistributionMessage\": {\n \"groupId\": \"120363418226875471@g.us\",\n \"axolotlSenderKeyDistributionMessage\": {\n \"0\": 51,\n \"1\": 8,\n \"2\": 142,\n \"3\": 192,\n \"4\": 202,\n \"5\": 130,\n \"6\": 6,\n \"7\": 16,\n \"8\": 0,\n \"9\": 26,\n \"10\": 32,\n \"11\": 37,\n \"12\": 224,\n \"13\": 43,\n \"14\": 44,\n \"15\": 217,\n \"16\": 51,\n \"17\": 183,\n \"18\": 226,\n \"19\": 224,\n \"20\": 214,\n \"21\": 106,\n \"22\": 2,\n \"23\": 45,\n \"24\": 55,\n \"25\": 247,\n \"26\": 102,\n \"27\": 138,\n \"28\": 25,\n \"29\": 104,\n \"30\": 184,\n \"31\": 228,\n \"32\": 234,\n \"33\": 76,\n \"34\": 104,\n \"35\": 167,\n \"36\": 44,\n \"37\": 114,\n \"38\": 253,\n \"39\": 120,\n \"40\": 108,\n \"41\": 155,\n \"42\": 149,\n \"43\": 34,\n \"44\": 33,\n \"45\": 5,\n \"46\": 249,\n \"47\": 192,\n \"48\": 143,\n \"49\": 225,\n \"50\": 30,\n \"51\": 250,\n \"52\": 80,\n \"53\": 208,\n \"54\": 182,\n \"55\": 88,\n \"56\": 24,\n \"57\": 121,\n \"58\": 76,\n \"59\": 177,\n \"60\": 112,\n \"61\": 121,\n \"62\": 102,\n \"63\": 65,\n \"64\": 245,\n \"65\": 110,\n \"66\": 183,\n \"67\": 182,\n \"68\": 234,\n \"69\": 126,\n \"70\": 85,\n \"71\": 82,\n \"72\": 212,\n \"73\": 29,\n \"74\": 85,\n \"75\": 79,\n \"76\": 16,\n \"77\": 94\n }\n },\n \"messageContextInfo\": {\n \"threadId\": [],\n \"deviceListMetadata\": {\n \"senderKeyIndexes\": [],\n \"recipientKeyIndexes\": [],\n \"senderKeyHash\": {\n \"0\": 104,\n \"1\": 76,\n \"2\": 48,\n \"3\": 9,\n \"4\": 51,\n \"5\": 35,\n \"6\": 12,\n \"7\": 174,\n \"8\": 71,\n \"9\": 39\n },\n \"senderTimestamp\": {\n \"low\": 1780579585,\n \"high\": 0,\n \"unsigned\": true\n },\n \"recipientKeyHash\": {\n \"0\": 178,\n \"1\": 98,\n \"2\": 210,\n \"3\": 215,\n \"4\": 214,\n \"5\": 115,\n \"6\": 141,\n \"7\": 49,\n \"8\": 137,\n \"9\": 97\n },\n \"recipientTimestamp\": {\n \"low\": 1778571917,\n \"high\": 0,\n \"unsigned\": true\n }\n },\n \"deviceListMetadataVersion\": 2\n },\n \"reactionMessage\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"AC927C4EFA14AC12F92D6E1F6465B70F\",\n \"participant\": \"207726193995780@lid\"\n },\n \"text\": \"\\u2764\\ufe0f\",\n \"senderTimestampMs\": {\n \"low\": -1595592022,\n \"high\": 414,\n \"unsigned\": false\n }\n }\n },\n \"messageType\": \"reactionMessage\",\n \"messageTimestamp\": 1780815836,\n \"instanceId\": \"b2bfaa30-d5f1-4b22-b979-791892619414\",\n \"source\": \"android\"\n },\n \"destination\": \"https://jimmy.huper.asia/webhooks/whatsapp-incoming\",\n \"date_time\": \"2026-06-07T04:03:57.250Z\",\n \"sender\": \"6590054438@s.whatsapp.net\",\n \"server_url\": \"https://evolution.huper.asia\",\n \"apikey\": \"5E4C532223A0-460B-A749-16C584F56E74\"\n}\n\nPrimary goal: REDUCE NOISE. Stay silent in the main Telegram chat unless the message truly needs Vincent's immediate attention. Approval-worthy reply drafts should go to the jimmy-wa-drafts sidecar, NOT to the main Telegram chat.\n\nFollow the whatsapp-evolution skill and /opt/data/agent-state/memory/whatsapp-vip-contacts.json, with these stricter notification rules:\n\nABSOLUTE SILENCE RULES:\n- If fromMe is true: ignore. Do NOT call send_message. Final response local log only.\n- Ignore status/update/read-receipt events or payloads with no real message content.\n- Do NOT send main Telegram summaries for ordinary DMs, routine logistics, casual family messages, generic group events, group chatter, forwards, memes, devotional/religious shares, links, standalone codes, normal confirmations, or suggested replies.\n- Do NOT ask \"Want me to add this to Calendar?\" for WhatsApp group events unless Vincent is explicitly invited, named, responsible, or the event is time-sensitive and clearly relevant to him today.\n- Do NOT surface family messages just because they are family. Only surface if urgent, safety-related, school/pickup critical, money/legal/medical, or explicitly requires Vincent now.\n\nSIDEcar RULES:\n- If a reply may be useful but it is not urgent, POST it to https://jimmy-draft.huper.asia/drafts using the whatsapp-evolution Draft Approvals flow. Do NOT duplicate that draft/summary in the main Telegram chat.\n- Prayer and birthday replies should go to the sidecar when appropriate, not the main Telegram chat, unless the underlying situation is urgent/sensitive and Vincent needs immediate awareness.\n- Before drafting any reply or approval suggestion, fetch recent context: python3 /opt/data/scripts/wa_recent_messages.py \"120363418226875471@g.us\" 30. For active/ambiguous groups use 50.\n\nMAIN TELEGRAM ALERTS ARE ALLOWED ONLY FOR:\n1. VIP contacts marked always_call or always_notify when the message is important/actionable, or any VIP call event.\n2. Calls from family/VIP that appear missed/ringing and likely need immediate attention. Avoid repeated alerts for accepted/terminated duplicate call state updates.\n3. Emergencies, safety issues, medical issues, school pickup/child logistics requiring action now, legal/money/account/security matters, or time-sensitive commitments in the next few hours.\n4. Messages explicitly asking Vincent to call back, decide now, approve payment, provide OTP/code for an active task, or respond urgently.\n5. Pipeline errors or webhook processing failures.\n\nCALL RULES:\n- VIP contacts (e.g. Dn Jeff, Ng Whye Keong): call Vincent immediately outside quiet hours when actionably important or on ringing/missed call. During 1am-10am SGT, Telegram only unless life-and-death emergency.\n- Do not call for casual VIP confirmations or non-actionable courtesy messages.\n\nTONE/RULES FOR DRAFTS:\n- Proposed replies must be Christian and grace-based where appropriate: warm, humble, encouraging, non-judgmental, peace-making, not preachy.\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- NEVER auto-reply directly on WhatsApp unless Vincent explicitly approves or future config marks that exact category pre-approved.\n\nDelivery rule:\n- For allowed main alerts only: call send_message(target=\"telegram\", message=...). Keep it concise.\n- For non-urgent reply suggestions: use the sidecar only.\n- For everything else: DO NOT call send_message. Keep final response as local-only log.\n", "timestamp": 1780815842.810316, "anchor": true}, {"id": 104147, "role": "assistant", "content": "This is a reaction message (❤️ emoji) from Charissa in a group chat. A simple emoji reaction with no substantive content, no request, no prayer, no actionable information.\n\n**Classification:** Broadcast/Silent — no real message content.\n\n**Action:** None required. No Telegram notification, no sidecar draft. Local log only.", "timestamp": 1780815842.8277056}], "bookend_end": [], "messages_before": 0, "messages_after": 2}, {"session_id": "20260607_070028_56fb6dcb", "when": "June 07, 2026 at 07:00 AM", "source": "webhook", "model": "glm-5.1", "title": null, "matched_role": "user", "match_message_id": 104143, "snippet": "...true`, they are replying to a >>>message<<< Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an >>>incoming<<< WhatsApp >>>message<<< is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification...", "bookend_start": [], "messages": [{"id": 104143, "role": "user", "content": "[IMPORTANT: The user has invoked the \"whatsapp-evolution\" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]\n\n---\nname: whatsapp-evolution\ndescription: WhatsApp messaging via Evolution API — send, receive, analyze, and auto-reply. Use when processing an incoming WhatsApp message delivered via the webhook, replying on Vincent's behalf, liaising with contacts, sending messages, or handling WhatsApp-related tasks. Covers intent analysis, auto-reply rules, escalation (VIP → phone call), conversation context fetching, and draft approvals via the jimmy-wa-drafts sidecar.\n---\n\n# WhatsApp via Evolution API\n\n## Credentials\nStored in `/opt/data/secrets/.whatsapp-evolution-credentials`:\n- `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE`\n\nLoad with: `export $(cat .whatsapp-evolution-credentials | xargs)`\n\n## Sending Messages (direct, no approval)\n\n```bash\nexport $(cat .whatsapp-evolution-credentials | xargs)\ncurl -s -X POST \"${EVOLUTION_API_URL}/message/sendText/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"number\": \"6591234567\", \"text\": \"Hello!\"}'\n```\n\n- Phone format: country code, no `+` (e.g. `6590054438`)\n- Groups: use JID `{groupid}@g.us`\n- For media/location/contacts — see [references/api.md](references/api.md)\n\nUse direct send only when Vincent has explicitly commanded/approved the exact send, or for pre-approved auto-replies/follow-ups. For any reply that needs Vincent's review, use the sidecar (next section).\n\n### Direct replies when Vincent names an existing WhatsApp thread\n\nIf Vincent says to “reply to X” or “confirm with X” and has clearly approved the action, but the contact JID/number is not in the current prompt:\n\n1. Search recent session history first (`session_search`) for the contact name, clinic/business name, or appointment details. Prior webhook summaries often contain the exact `remoteJid`, alternate phone JID, sender name, and context.\n2. If session history identifies the WhatsApp phone JID, send directly only if Vincent’s instruction is explicit enough to act. Keep the text concise and natural.\n3. If only a LID JID is found, look for `remoteJidAlt`/alternate phone JID in the session summary or fetch contacts/chats before sending.\n4. Do not invent phone numbers from web search when replying to an existing WhatsApp conversation. Use web search only as supporting context, not as the destination, unless Vincent explicitly asks to contact a public number.\n\n\n### Sending trip/camp details from retrieved emails\n\nWhen Vincent explicitly asks to send extracted school trip/camp details to a child or family member, send directly. Keep it practical and grounded:\n- Include dates, reporting/pick-up times, flights, venue/itinerary highlights, deadlines, forms, and safety/medication notes when found.\n- If an official packing list is missing, say that clearly. Do not invent one. You may add a short “based on itinerary/weather, prepare…” list, clearly labeled as practical guidance rather than official school instructions.\n- Avoid long explanations about the search process in WhatsApp. Send the useful details.\n\n### Sending a local image/file as media\n\nFor user-approved media sends, base64-encode the local file and call `sendMedia`:\n\n```bash\nexport $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\npython3 - <<'PY'\nimport base64, json\npath = '/tmp/file.png'\npayload = {\n 'number': '6591234567',\n 'mediatype': 'image',\n 'mimetype': 'image/png',\n 'caption': 'Caption here',\n 'media': base64.b64encode(open(path, 'rb').read()).decode(),\n 'fileName': 'file.png',\n}\nopen('/tmp/wa_media_payload.json', 'w').write(json.dumps(payload))\nPY\ncurl -s --max-time 60 -X POST \"${EVOLUTION_API_URL}/message/sendMedia/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_media_payload.json\n```\n\nWhen sending QR tickets or passes, verify the crop/image contains the full QR code and the intended attendee name before sending; the QR screenshot is effectively the redeemable ticket.\n\n## Fetching Conversation Context\n\nBefore drafting or proposing any reply, always fetch recent chat history for context:\n\n```bash\npython3 /opt/data/scripts/wa_recent_messages.py \"6591234567@s.whatsapp.net\" 30\n```\n\nUse `limit=30` by default. For active group chats or ambiguous situations, fetch up to 50 messages. Read enough context to understand who is being discussed, current tone, previous commitments, and whether Vincent has already replied. The script returns newest messages first (descending order).\n\n## Draft Approvals — `jimmy-wa-drafts` sidecar\n\nWhen you need Vincent to approve a WhatsApp reply, POST the draft to the sidecar. It owns the entire approval flow: Telegram message with 4 inline buttons, Telegram WebApp for Edit/Context, sending to WhatsApp via Evolution API on Approve.\n\n### Posting a draft\n\nAlways include an `extras` object with the original WA context. The sidecar passes `extras` through verbatim if Vincent later clicks \"Context\" — without it, your redraft will be flying blind.\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\n \"contact\": \"120363351787895353@g.us\",\n \"contact_name\": \"Cycling group\",\n \"summary\": \"Karen sent a blessing over the bike ride; she's not joining\",\n \"draft_text\": \"Thanks Karen, will miss you both — praying for a good time too 🙏\",\n \"extras\": {\n \"group_jid\": \"120363351787895353@g.us\",\n \"sender_name\": \"Karen Tan\",\n \"sender_jid\": \"6597630553@s.whatsapp.net\",\n \"category\": \"prayer/blessing\",\n \"original_message\": \"Happy cycling, to those who can make it… Dn Jeff and I had made plans for this PM so we won't be able to join y'all. Praying the Lord's favor, protection and good-weather blessings be upon y'all and your route!\",\n \"extracted\": {\n \"what\": \"Cycling from East Coast Park to Changi Village, then dinner\",\n \"when\": \"Today, 4:00 PM\",\n \"where\": \"East Coast Park to Changi Village\",\n \"who\": \"Group members who voted yes for cycling/dinner or dinner\"\n },\n \"confidence\": \"high\",\n \"why_appropriate\": \"Multiple group messages are blessings/wishes; Vincent has not replied; tone matches grace-based norm.\",\n \"wa_message_id\": \"ABC123XYZ\"\n }\n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\nResponse: `{\"draft_id\":\"abc123def\",\"telegram_message_id\":42}`\n\nWhat to put in `extras` (use whatever makes sense for the message type):\n\n- **Always**: `original_message` (verbatim WA text), `sender_name`, `category` (e.g. `prayer`, `birthday`, `event`, `dm-reply`, `enquiry`)\n- **Group messages**: `group_jid`, `sender_jid`\n- **DMs**: `sender_jid`\n- **Strongly recommended (drives the rich Telegram render)**: `extracted` (dict of who/what/when/where), `confidence` (`high`/`medium`/`low`), `why_appropriate` (one-line reasoning shown to Vincent so he can decide)\n- **Optional**: `wa_message_id`, `tone_hint` (e.g. `warm-grace-based`), `sender_phone` (auto-derived from `sender_jid` if omitted)\n\nVincent receives a Telegram message in the `@jimmy_WA_bot` chat with 4 inline buttons. The sidecar handles every interaction; **you do not track replies, callbacks, or message IDs**. Just POST and move on.\n\n| Button | Behavior |\n|--------|----------|\n| **Approve** | Sidecar sends draft to WhatsApp |\n| **Edit** | Opens a Telegram WebApp with draft pre-filled; Vincent edits + sends; sidecar pushes to WhatsApp |\n| **Context** | Opens a Telegram WebApp with empty textarea + the original WA message visible; Vincent provides context. (When `HERMES_REDRAFT_WEBHOOK` is wired in the sidecar's env, this will POST back to Jimmy for a redraft. Until then, the context is just stored on the draft.) |\n| **Pass** | Sidecar deletes the draft |\n\n### JSON escaping when `draft_text` contains quotes\n\nSingle quotes (`It's`, `God's`) break shell interpolation. Write the JSON to a temp file first:\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\"contact\":\"6591234567@s.whatsapp.net\",\"contact_name\":\"John\",\"summary\":\"…\",\"draft_text\":\"It's nice to hear from you!\"}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\n### Networking\n\n`https://jimmy-draft.huper.asia` is the public HTTPS endpoint of the sidecar (TLS-terminated by Traefik on ubuntu-server). HTTPS is required because Hermes blocks plain-HTTP commands via tirith. The internal Docker DNS hostname http://jimmy-wa-v64mqe:8080 also works for non-shell calls but the agent is forbidden from using it.\n\n### Sidecar status & logs\n\n- Source: (auto-deploys on push)\n- Public URL: (`/health` for ping)\n- Container: `jimmy-wa-v64mqe.1.` on ubuntu-server\n- Bot: `@jimmy_WA_bot`\n\n## Handling a Redraft Trigger (`wa-redraft` webhook)\n\nWhen Vincent clicks the **Context** button on a draft, the sidecar:\n1. Edits the original Telegram message to \"🔄 Adding context for X… agent is redrafting.\" (removes the old buttons so Vincent can't approve the now-stale text)\n2. POSTs to your `wa-redraft` webhook (`http://localhost:8644/webhooks/wa-redraft`) with the original draft fields plus Vincent's `context` instructions and the `extras` you stored when you first created the draft.\n\nWhen you receive a `wa-redraft` activation, the prompt template will lay out the fields. Your job:\n\n1. Read Vincent's `context` carefully — that's the steering signal. The `previous_draft` is just history, not necessarily a starting point.\n2. Re-fetch recent WA history if the original context isn't enough (`python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`). Use `extras.group_jid` for groups or `contact` for DMs.\n3. Write a NEW draft that addresses Vincent's instructions while staying faithful to the original WA context (in `extras.original_message`, `extras.extracted`).\n4. POST the new draft back to the sidecar **with the same `draft_id`** so the existing draft is updated, not duplicated:\n\n```bash\ncat > /tmp/wa_redraft.json <\",\n \"contact\": \"\",\n \"contact_name\": \"\",\n \"summary\": \"\",\n \"draft_text\": \"\",\n \"extras\": \n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_redraft.json\n```\n\nThe sidecar will send a NEW Telegram message with 4 fresh buttons. Vincent can Approve, Edit, Context-again (loop), or Pass.\n\nDo **not** send to WhatsApp directly during a redraft — the approval flow is the sidecar's job.\n\n## Incoming Message Processing\n\n### Delivery Pipeline (current)\n1. Evolution API POSTs each `MESSAGES_UPSERT` webhook directly to Hermes at `http://localhost:8644/webhooks/whatsapp-incoming` (proxied via `jimmy.huper.asia/webhooks/whatsapp-incoming`)\n2. Hermes webhook route (`whatsapp-incoming`, dynamic subscription) validates and spawns an agent\n3. Agent loads `whatsapp-evolution` skill, follows the rules below\n4. Webhook subscription uses `deliver: log` because webhooks do not support `deliver: local`; the agent explicitly calls `send_message(target=\"telegram\", ...)` only for important/actionable/VIP/event/birthday/prayer/error items, OR posts a draft to the sidecar for approval items.\n\n**No more OpenClaw, no Python webhook server, no port 5001, no keepalive cron.** If messages aren't being processed, check: nginx proxy to port 8644, `hermes webhook list` for the subscription, Evolution API webhook URL setting.\n\n## Contact Rules & Escalation\n\nConfig: `memory/whatsapp-vip-contacts.json`\n\n### Priority Levels\n\n| Priority | Action | Example |\n|----------|--------|---------|\n| **VIP (call-always)** | **CALL Vincent immediately via Twilio without asking.** Do NOT ask for approval first. During quiet hours (1am-10am SGT) suppress call and use Telegram notification only, unless life-and-death emergency. Match VIPs by both phone JID (`659...@s.whatsapp.net`) and LID JID (`...@lid`) when present, because recent chats may arrive with `key.remoteJid` as an LID and the phone number only in `key.remoteJidAlt`. | Dn Jeff (+6597630553) |\n| **Family** | Always notify Vincent on Telegram | ps112 group, family members |\n| **Auto-reply** | Reply based on context file, notify Vincent | Repair shops |\n| **Normal** | Analyze intent, reply if appropriate, notify if important | Everyone else |\n| **Broadcast/Spam** | Ignore silently | Forwarded job posts, chain messages |\n\n### Quiet Hours\n- 1am-10am SGT (17:00-02:00 UTC): avoid phone calls, including for Dn Jeff / VIP call-always contacts\n- **Exception:** life-and-death emergencies always call regardless of quiet hours\n- During quiet hours, use Telegram notification only and mention that a call was suppressed due to quiet hours\n\n## Making a Call via Twilio\n\nCredentials: `/opt/data/secrets/.twilio-credentials-new` (`TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`). From: `+16614809287`, Vincent: `+6590054438`.\n\nPreferred script: `/opt/data/scripts/call_vincent.py`, now using the Twilio Python SDK baked into the container.\n\n```bash\npython3 /opt/data/scripts/call_vincent.py \"Hello Vincent, this is Jimmy. Urgent summary here.\"\n```\n\n## Action Rules\n\n### Default: DO NOT auto-reply directly, and do not spam main Telegram\nCollect intent and stay local unless something truly needs Vincent's immediate attention.\n\n**When you need Vincent to approve a reply, POST a draft to the sidecar (see \"Draft Approvals\" above).** The sidecar handles the Telegram notification with buttons, the WebApp for editing, and sending to WhatsApp on approve. Do **not** duplicate that draft or summary in the main Telegram chat.\n\nMain Telegram WhatsApp summaries are muted by default. Only use `send_message(target=\"telegram\", ...)` for urgent/important/actionable items that need Vincent now: VIP actionable messages/calls, missed/ringing family calls, emergencies, safety/medical/school-pickup issues, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors. Routine logistics, casual family pings, generic group events, links, standalone codes, normal confirmations, and suggested replies should be local-only or sidecar-only.\n\nDo NOT send WhatsApp replies directly unless Vincent explicitly approves them or the category is pre-approved.\n\n### Tone (when proposing or sending replies)\n- Casual, direct, no sign-offs (no \"Thanks & Regards\", no \"- Vincent\")\n- Ask questions instead of giving instructions\n- Keep it natural — like texting a friend\n- Christian and grace-based: warm, humble, encouraging, non-judgmental, peace-making\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- For prayer/birthday contexts, use gentle faith-filled language; avoid sounding preachy, performative, or overly formal\n\n### What to Do With Each Message\n\n**DMs (personal chats):**\n- Analyze intent\n- Notify Vincent on Telegram (via `send_message(target=\"telegram\", ...)`) only if important/actionable/time-sensitive, requires a decision, comes from a VIP contact\n- If you have a useful suggested reply, POST it as a draft to the sidecar instead of a Telegram notification — Vincent gets the buttons inline\n- Casual/social/non-actionable DMs → local log only, no Telegram notification, no draft\n- Do NOT reply unless Vincent says to\n- VIP contacts → **call Vincent immediately without asking**. Do NOT post a draft. Just call. During quiet hours (1am-10am SGT), suppress the call and send Telegram notification only, unless life-and-death emergency.\n\n**Group chats — events/meetups:**\n- If someone posts about an event, gathering, meetup, dinner, etc:\n 1. Extract: what, when, where, who\n 2. Notify Vincent on Telegram with details (no draft needed unless you also want to suggest an RSVP reply)\n 3. Ask: \"Want me to add this to Google Calendar and set a reminder?\"\n\n**Group chats — prayer requests / praying for others:**\n- If someone asks for prayer, shares a burden, illness, grief, struggle, thanksgiving, or asks the group to pray:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Identify who/what to pray for and any sensitive details\n 3. Draft a short, grace-based Christian reply/prayer\n 4. POST it to the sidecar with `summary` describing the request and `draft_text` being the prayer\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — birthday greetings:**\n- If you see birthday wishes for someone in a group:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Note who the birthday person is and relationship/context if inferable\n 3. Draft a warm, grace-based Christian birthday greeting\n 4. POST it to the sidecar\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — general:**\n- Casual banter, memes, forwards, broadcast spam → ignore silently\n- Religious content shares → ignore unless specifically addressed to Vincent\n\n### When to Notify Vincent (Telegram) vs Post a Draft\n\n**Use `send_message(target=\"telegram\", ...)` sparingly** for plain notifications only when the message needs Vincent's immediate attention: VIP actionable DMs/calls, emergencies, school pickup/child logistics requiring action now, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors.\n\n**Use the sidecar (POST `/drafts`)** whenever you have a suggested reply for Vincent to approve: prayer responses, birthday greetings, DM replies, group RSVPs. Do not also send the same summary to the main Telegram chat.\n\nDo **not** notify or draft for casual/social/non-actionable DMs, routine logistics, read receipts/status updates, religious shares not addressed to Vincent, memes, forwarded/broadcast/spam, generic group events, standalone codes, normal confirmations, or general group chatter. Family messages are not automatically main-chat alerts unless urgent, safety-related, school/pickup critical, or explicitly requiring Vincent now.\n\n## Intent Categories\n\nWhen analyzing incoming messages, classify as:\n\n1. **Urgent** — time-sensitive, emergencies → call Vincent outside quiet hours; during quiet hours use Telegram only unless life-and-death\n2. **Important** — decisions, money, appointments, anything consequential → call Vincent outside quiet hours; during quiet hours notify on Telegram only unless life-threatening\n3. **Request** — asking Vincent to do something → **notify Vincent** (Telegram) or post a draft if you have a safe reply\n4. **Enquiry** — asking for info, pricing, availability → auto-reply if context available, otherwise post a draft\n5. **Confirmation** — confirming plans, bookings → acknowledge and notify Vincent\n6. **Casual/Social** — greetings, memes, forwards → usually ignore or react\n7. **Broadcast/Spam** — forwarded job posts, chain messages → ignore\n\n## Finding Individual Contacts and JIDs\n\nWhen Vincent asks to monitor or prioritize a specific WhatsApp contact and only gives a name:\n\n1. Query contacts and chats, then search both display names and serialized message metadata:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n python3 - <<'PY'\n import os, requests, json\n base=os.environ['EVOLUTION_API_URL']; inst=os.environ['EVOLUTION_INSTANCE']; key=os.environ['EVOLUTION_API_KEY']\n headers={'apikey':key,'Content-Type':'application/json'}\n terms=['whye','keong'] # replace with lowercase search terms\n for ep in ['findContacts','findChats']:\n r=requests.post(f'{base}/chat/{ep}/{inst}', headers=headers, json={}, timeout=30)\n print('endpoint', ep, 'status', r.status_code)\n for obj in (r.json() if r.ok else []):\n if any(t in json.dumps(obj, ensure_ascii=False).lower() for t in terms):\n print(json.dumps(obj, ensure_ascii=False)[:1000])\n PY\n ```\n2. If `findChats` shows `remoteJid` ending in `@lid`, inspect `lastMessage.key.remoteJidAlt` for the phone JID. Store both in `memory/whatsapp-vip-contacts.json` when configuring VIP/call rules.\n3. Verify the number exists with `/chat/whatsappNumbers/{instance}` before finalizing.\n4. Validate the JSON config after editing: `python3 -m json.tool /opt/data/agent-state/memory/whatsapp-vip-contacts.json >/tmp/wa_vip_validated.json`.\n\n## Finding Group JIDs\n\nThe `group/fetchGroups` endpoint returns 404 on some Evolution API instances (including ours). To find a group JID when you don't have it:\n\n1. **Use `chat/findMessages` with an empty `where` filter** to pull recent messages across all chats, then extract unique `remoteJid` values containing `@g.us`.\n2. **Match by content**: Fetch messages from each group JID using `where.key.remoteJid` and match sender names or message text to identify the target group.\n3. **Known group JIDs** (store in `whatsapp-vip-contacts.json` as you discover them):\n - ps112 family group: `6591887488-1464061170@g.us`\n - Overcomers group: `6591444911-1614821287@g.us`\n - 神的华美 group: `6596756578-1405873840@g.us` (content-sharing, casual/broadcast, local log only)\n - Overcomers/TCians group: `120363230236878544@g.us` (church fellowship, PJ sends weekly attendance reports, routine attendance = local log only)\n\nExample search:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"6591444911-1614821287@g.us\"}}, \"limit\": 10, \"order\": \"desc\"}'\n```\n\n## Processing Image/Media Messages\n\nWhen an incoming WhatsApp message is an image (imageMessage), video, or other media with no text or only a caption:\n\n1. **Fetch conversation context** as usual with `wa_recent_messages.py`\n2. **Download the media** using the Evolution API:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"message\": {\"key\": {\"id\": \"MESSAGE_ID\", \"fromMe\": false, \"remoteJid\": \"GROUP_OR_CHAT_JID\"}}, \"convertToMp4\": false}' \\\n -o /tmp/wa_media_response.json\n ```\n Then decode: `python3 -c \"import json,base64; d=json.load(open('/tmp/wa_media_response.json')); open('/tmp/wa_media.jpg','wb').write(base64.b64decode(d['base64']))\"`\n3. **Analyze the image** using `vision_analyze` tool with `/tmp/wa_media.jpg` path and a context-aware question\n4. **Fallback if vision is unavailable**: Use `tesseract /tmp/wa_media.jpg stdout` for OCR, combined with pixel analysis (PIL/numpy) and conversation context to infer what the image contains\n5. **Album messages** (`albumMessage`): These bundle multiple images. Each image arrives as a separate `imageMessage` with a `messageAssociation.parentMessageKey` linking them to the album. Download each image separately. The `expectedImageCount` field tells how many images to expect\n6. **messageAssociation**: When images have `parentMessageKey` with `fromMe: true`, they are replying to a message Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an incoming WhatsApp message is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification when possible:\n\n1. **Fetch conversation context first**, especially in groups:\n ```bash\n python3 /opt/data/scripts/wa_recent_messages.py \"\" 50\n ```\n2. **Download the audio** with Evolution API, saving the response to a file:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_audio_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"GROUP_OR_CHAT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_audio_req.json -o /tmp/wa_audio_response.json\n python3 - <<'PY'\n import json, base64\n d=json.load(open('/tmp/wa_audio_response.json'))\n b=d.get('base64') or d.get('data',{}).get('base64')\n if ',' in b[:100]: b=b.split(',',1)[1]\n open('/tmp/wa_audio.ogg','wb').write(base64.b64decode(b))\n PY\n ```\n3. **Convert OGG/Opus to WAV**:\n ```bash\n ffmpeg -y -i /tmp/wa_audio.ogg -ar 16000 -ac 1 /tmp/wa_audio.wav\n ```\n4. **Transcribe fallback when Whisper is unavailable**: the container may not have `whisper`, `faster-whisper`, `torch`, or `speech_recognition` installed globally. A reliable lightweight fallback is a temp venv with SpeechRecognition and Google web speech:\n ```bash\n python3 -m venv /tmp/srvenv\n /tmp/srvenv/bin/pip install --quiet SpeechRecognition\n /tmp/srvenv/bin/python - <<'PY'\n import speech_recognition as sr\n r=sr.Recognizer()\n with sr.AudioFile('/tmp/wa_audio.wav') as source:\n audio=r.record(source)\n print(r.recognize_google(audio, language='en-SG'))\n PY\n ```\n5. **Check adjacent voice notes from the same sender**. WhatsApp voice notes are often sent in quick sequential chunks. Download and transcribe nearby `audioMessage` records from the same sender before deciding whether to notify Vincent.\n6. Use the transcript plus recent context to classify the message. For group meetup planning, extract what/when/where/who and notify Vincent if actionable, including transcript confidence.\n\n### Video Messages\n\nWhen an incoming WhatsApp message is `videoMessage` (with or without a caption):\n\n1. **Fetch conversation context** first, as with any media message: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n2. **Download the video** using Evolution API (same pattern as images/audio — save response to file, then decode):\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_video_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"CONTACT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_video_req.json -o /tmp/wa_video_response.json\n python3 - <<'PY'\n import json, base64\n d = json.load(open('/tmp/wa_video_response.json'))\n b = d.get('base64') or d.get('data', {}).get('base64', '')\n if ',' in b[:100]: b = b.split(',', 1)[1]\n open('/tmp/wa_video.mp4', 'wb').write(base64.b64decode(b))\n print(f\"Video saved: {len(base64.b64decode(b))} bytes\")\n PY\n ```\n3. **Extract representative frames** with ffmpeg (do NOT try to analyze video directly — vision tools work on images):\n ```bash\n # Frame at start\n ffmpeg -y -ss 0 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame0.jpg\n # Frame at ~1/3 of the way (good for short videos)\n ffmpeg -y -ss 3 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame3s.jpg\n # If video is longer (>15s), also extract at midpoint\n # ffprobe -v quiet -print_format json -show_format /tmp/wa_video.mp4 # get duration\n # ffmpeg -y -ss $HALF_DURATION -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame_mid.jpg\n ```\n4. **Analyze the frames** with `vision_analyze`. Ask specific questions about content: what objects/people/text are visible, what type of video this appears to be (screen recording, personal clip, forwarded meme, etc.), and any text or identifying details.\n5. **Use the caption** from `videoMessage.caption` if present (can be empty/null). Combine frame analysis + caption + conversation context to classify the message intent.\n6. **Transcription of audio in video** is generally not feasible with available tools. If the video likely contains important spoken content (e.g. voice memo style), note this limitation and flag it to Vincent.\n\n**Key points:**\n- Always use `-o` flag and file-based decode; videos can be 1-10MB and base64 responses will be even larger\n- `--max-time 25` on curl is essential for video downloads\n- Short casual clips (showing a key fob, a parking spot, quick view of surroundings) are usually casual/social — classify accordingly and don't over-escalate\n- If `vision_analyze` fails (400 on models that don't support images, or 429 rate limit), fall back to conversation context inference alone\n\n### Poll Messages\nWhen `pollUpdateMessage` appears in chat history (someone voted), the message contains a `pollCreationMessageKey` with the original poll's message `id`. To get the poll question and options, query:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"id\": \"POLL_MESSAGE_ID\", \"remoteJid\": \"GROUP_JID\"}}, \"limit\": 1}'\n```\nThe response will have `pollCreationMessageV3` with `name` (the poll question) and `options` array. Always look up the poll when votes or poll references appear, since the poll text contains the event details.\n\n### `associatedChildMessage` (Motion Photos / Companion Media)\nWhatsApp motion photos and companion media arrive as `associatedChildMessage` type. These are short videos or supplementary images attached to a parent message via `messageAssociation.parentMessageKey`. The structure is:\n\n```\nmessage: {\n messageContextInfo: {\n messageAssociation: {\n associationType: 12,\n parentMessageKey: { id, fromMe, remoteJid }\n }\n },\n associatedChildMessage: {\n message: {\n videoMessage: { ... } // or imageMessage: { ... }\n }\n }\n}\n```\n\nKey points:\n- The `parentMessageKey.id` links back to the main message (the image/video they accompany)\n- When `parentMessageKey.fromMe: true`, the companion media is replying to a message Vincent sent\n- Captions may appear on the child's `videoMessage.caption` or `imageMessage.caption` inside the nested message, NOT at the top level\n- These often contain motion-photo short videos (3 seconds) or album companion images\n- Treat them as media attachments to the parent; don't process them as standalone messages requiring independent intent analysis\n\n### Sequential Image Sets (Non-Album)\nMultiple images sent rapidly by the same sender at similar timestamps may NOT be wrapped in an `albumMessage`. They appear as separate `imageMessage` entries. Always check for adjacent images from the same sender and download+OCR all of them together, as they often form a coherent multi-page set (e.g., route info, comparison tables, maps). Missing one can lose important context.\n\n### Key pitfalls\n- The Evolution API `getBase64FromMediaMessage` endpoint returns the full base64 payload (can be 100KB+ for images). Use `-o` flag to save to file, then parse with Python. Do NOT try to pipe it directly\n- `curl` with `--max-time 25` is essential; without it, large media responses can hang\n- The `albumMessage` type itself has no `fromMe` field; check each individual `imageMessage` instead\n- Image thumbnails in the webhook payload are truncated; always download the full image for analysis\n- **WhatsApp media URLs expire**: CDN download links (`mmg.whatsapp.net`) expire after roughly 24-48 hours. When `getBase64FromMediaMessage` returns a 400/404 error like \"Failed to fetch stream from https://mmg.whatsapp.net/...\", the media is no longer retrievable via the API. This means you CANNOT download adjacent/older voice notes, images, or documents for context if they're more than a day or two old. Fall back to conversation text context only. Fresh/incoming messages are almost always still downloadable.\n- **Voice note transcription reliability**: Short voice notes (<15s), especially in Singlish/Chinese-English mix common in Singapore contacts, often produce unreliable transcriptions via SpeechRecognition+Google. Always flag low-confidence transcriptions to Vincent rather than acting on potentially garbled text.\n- **Vision API failures**: `vision_analyze` can fail with 429 (rate limit) **or 400** (model doesn't support image input, e.g. glm-5.1 via ollama-cloud). Always have a fallback plan. When vision is unavailable, try tesseract OCR, then pixel analysis (PIL/numpy for brightness zones, text region detection via row-level dark-pixel density)\n- **OCR can fail on decorative/complex images**: Obituaries, memorial notices, religious flyers with ornate fonts, or dark-background images often defeat tesseract even with aggressive preprocessing (upscaling, thresholding, inverting, PSM mode variations). When OCR returns gibberish or empty, fall back to conversation context inference: use preceding/surrounding messages to determine what the image likely contains, then draft an appropriate response based on that context\n- **OCR on devotional/dark-background images**: These often defeat tesseract on full-image scans. Use a multi-strategy approach: (1) run tesseract on the full image first, (2) if text is sparse, crop into regions (top 15%, middle, bottom 15%), upscale 2-3x, enhance contrast (2-3x), and try both normal and inverted (`ImageOps.invert`) versions with `--psm 6`, (3) combine all partial OCR results. For Exodus/verse images, the title (e.g. \"WHEN MOSES' HANDS WERE\") is often in decorative font that OCR misses, while the verse body text is usually readable.\n- **Prioritize conversation context**: When processing group images, always fetch recent context FIRST (before attempting vision/OCR). If image analysis fails, the conversation history often makes the image's purpose clear enough to draft a meaningful reply\n- **DM conversation context is one-sided (fromMe only)**: The Evolution API `findMessages` endpoint often returns only Vincent's sent messages (`fromMe: true`) for DM chats, not the contact's incoming messages. This means `wa_recent_messages.py` can show only one side of the conversation. When processing a DM webhook, do NOT rely solely on fetched history for the contact's side. The webhook payload already contains the incoming message text. Use what the webhook gives you plus Vincent's side from the API. If you need the contact's recent messages, check if `fromMe: false` records exist; if none are returned, the API simply doesn't have them and you must work with what you have (webhook message + Vincent's replies for context).\n- **`findMessages` ignores `key.fromMe` filter**: Passing `\"key.fromMe\": false` in the `where` clause still returns all messages (including `fromMe: true`). To find messages from a specific sender, fetch results and filter client-side by checking `fromMe` in Python.\n- **`findMessages` caps at ~50 records per request**: Even with `limit: 200` or `limit: 500`, the API returns at most 50 records. When one party sends many consecutive messages, the other party's messages may not appear in the first 50 results. Use pagination or fetch multiple pages if you need older messages.\n- **Video frame extraction**: Use `ffmpeg -y -ss N -i video.mp4 -frames:v 1 -update 1 output.jpg` (the `-update 1` flag is needed; without it ffmpeg treats the output as an image sequence and fails). Extract frames at multiple timestamps (start, ~1/3, midpoint for longer videos) since a single frame may miss the key content.\n- **ffmpeg \"Invalid color space\" failure on WhatsApp videos**: Some WhatsApp H.264 videos have `color_space=reserved` and `color_primaries=reserved` metadata that ffmpeg cannot decode at all. Every codec/format/filter combination fails with \"Invalid color space\" / return code 234. This is NOT fixable with `-vf setparams`, `-color_trc`, `-colorspace`, `-color_primaries`, or `-vf format=` flags because the error occurs at the decoder level before any filter runs. **Do not waste time trying ffmpeg workarounds.** When this happens, fall back to: (1) conversation context inference (who sent it, what thread it's in, adjacent messages), (2) the jpegThumbnail embedded in the videoMessage (small but gives a rough idea; base64-encoded JPEG, typically 600-900 bytes, use `vision_analyze` on it if available), or (3) if all analysis fails, just classify based on surrounding text context.\n- **`wa_recent_messages.py` JSON output breaks `json.loads()`**: The script's stdout contains WhatsApp message content with control characters (emoji, CJK, deviceListMetadata binary blobs) that cause `json.loads()` to fail with `Invalid control character` or `Expecting ',' delimiter` errors. Even `strict=False` does not reliably fix it. **Reliable approach**: use `curl -o /tmp/wa_msgs.json` to save the Evolution API response directly to a file (the file write handles binary safely), then parse with `json.load(open('/tmp/wa_msgs.json'))` in Python. Example:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 15 -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"CONTACT_JID\"}}, \"limit\": 20}' -o /tmp/wa_msgs.json\n ```\n Then in Python: `data = json.load(open('/tmp/wa_msgs.json')); records = data['messages']['records']`\n\n## API Reference\n\nFor full API endpoints (media, contacts, location, buttons) — see [references/api.md](references/api.md).\n\n[Skill directory: /opt/data/skills/personal/whatsapp-evolution]\nResolve any relative paths in this skill (e.g. `scripts/foo.js`, `templates/config.yaml`) against that directory, then run them with the terminal tool using the absolute path.\n\n[This skill has supporting files:]\n- references/api.md -> /opt/data/skills/personal/whatsapp-evolution/references/api.md\n- scripts/process_queue.py -> /opt/data/skills/personal/whatsapp-evolution/scripts/process_queue.py\n- scripts/send_text.sh -> /opt/data/skills/personal/whatsapp-evolution/scripts/send_text.sh\n\nLoad any of these with skill_view(name=\"personal/whatsapp-evolution\", file_path=\"\"), or run scripts directly by absolute path (e.g. `node /opt/data/skills/personal/whatsapp-evolution/scripts/foo.js`).\n\nThe user has provided the following instruction alongside the skill invocation: Incoming WhatsApp webhook received.\n\nEvent: messages.upsert\nInstance: n8n-WA\nSender: Joelle\nSender JID: 120363418226875471@g.us\nFrom me: False\nMessage type: reactionMessage\nMessage: {\n \"event\": \"messages.upsert\",\n \"instance\": \"n8n-WA\",\n \"data\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"3AC247BC5A50B4F3CFF1\",\n \"participant\": \"192260067090531@lid\",\n \"participantAlt\": \"6594519406@s.whatsapp.net\",\n \"addressingMode\": \"lid\"\n },\n \"pushName\": \"Joelle\",\n \"status\": \"DELIVERY_ACK\",\n \"message\": {\n \"reactionMessage\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": true,\n \"id\": \"AC37643983DC88906B4D0F1A09F74082\",\n \"participant\": \"54644919296042@lid\"\n },\n \"text\": \"\\u2764\\ufe0f\",\n \"senderTimestampMs\": {\n \"low\": -1595799960,\n \"high\": 414,\n \"unsigned\": false\n }\n }\n },\n \"messageType\": \"reactionMessage\",\n \"messageTimestamp\": 1780815628,\n \"instanceId\": \"b2bfaa30-d5f1-4b22-b979-791892619414\",\n \"source\": \"ios\"\n },\n \"destination\": \"https://jimmy.huper.asia/webhooks/whatsapp-incoming\",\n \"date_time\": \"2026-06-07T04:00:28.852Z\",\n \"sender\": \"6590054438@s.whatsapp.net\",\n \"server_url\": \"https://evolution.huper.asia\",\n \"apikey\": \"5E4C532223A0-460B-A749-16C584F56E74\"\n}\n\nPrimary goal: REDUCE NOISE. Stay silent in the main Telegram chat unless the message truly needs Vincent's immediate attention. Approval-worthy reply drafts should go to the jimmy-wa-drafts sidecar, NOT to the main Telegram chat.\n\nFollow the whatsapp-evolution skill and /opt/data/agent-state/memory/whatsapp-vip-contacts.json, with these stricter notification rules:\n\nABSOLUTE SILENCE RULES:\n- If fromMe is true: ignore. Do NOT call send_message. Final response local log only.\n- Ignore status/update/read-receipt events or payloads with no real message content.\n- Do NOT send main Telegram summaries for ordinary DMs, routine logistics, casual family messages, generic group events, group chatter, forwards, memes, devotional/religious shares, links, standalone codes, normal confirmations, or suggested replies.\n- Do NOT ask \"Want me to add this to Calendar?\" for WhatsApp group events unless Vincent is explicitly invited, named, responsible, or the event is time-sensitive and clearly relevant to him today.\n- Do NOT surface family messages just because they are family. Only surface if urgent, safety-related, school/pickup critical, money/legal/medical, or explicitly requires Vincent now.\n\nSIDEcar RULES:\n- If a reply may be useful but it is not urgent, POST it to https://jimmy-draft.huper.asia/drafts using the whatsapp-evolution Draft Approvals flow. Do NOT duplicate that draft/summary in the main Telegram chat.\n- Prayer and birthday replies should go to the sidecar when appropriate, not the main Telegram chat, unless the underlying situation is urgent/sensitive and Vincent needs immediate awareness.\n- Before drafting any reply or approval suggestion, fetch recent context: python3 /opt/data/scripts/wa_recent_messages.py \"120363418226875471@g.us\" 30. For active/ambiguous groups use 50.\n\nMAIN TELEGRAM ALERTS ARE ALLOWED ONLY FOR:\n1. VIP contacts marked always_call or always_notify when the message is important/actionable, or any VIP call event.\n2. Calls from family/VIP that appear missed/ringing and likely need immediate attention. Avoid repeated alerts for accepted/terminated duplicate call state updates.\n3. Emergencies, safety issues, medical issues, school pickup/child logistics requiring action now, legal/money/account/security matters, or time-sensitive commitments in the next few hours.\n4. Messages explicitly asking Vincent to call back, decide now, approve payment, provide OTP/code for an active task, or respond urgently.\n5. Pipeline errors or webhook processing failures.\n\nCALL RULES:\n- VIP contacts (e.g. Dn Jeff, Ng Whye Keong): call Vincent immediately outside quiet hours when actionably important or on ringing/missed call. During 1am-10am SGT, Telegram only unless life-and-death emergency.\n- Do not call for casual VIP confirmations or non-actionable courtesy messages.\n\nTONE/RULES FOR DRAFTS:\n- Proposed replies must be Christian and grace-based where appropriate: warm, humble, encouraging, non-judgmental, peace-making, not preachy.\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- NEVER auto-reply directly on WhatsApp unless Vincent explicitly approves or future config marks that exact category pre-approved.\n\nDelivery rule:\n- For allowed main alerts only: call send_message(target=\"telegram\", message=...). Keep it concise.\n- For non-urgent reply suggestions: use the sidecar only.\n- For everything else: DO NOT call send_message. Keep final response as local-only log.\n", "timestamp": 1780815636.1526937, "anchor": true}, {"id": 104144, "role": "assistant", "content": "**Local log only.** Joelle reacted with ❤️ to one of Vincent's messages in group `120363418226875471@g.us`. This is a reaction/receipt with no actionable content. No Telegram notification, no draft, no reply needed.", "timestamp": 1780815636.2674494}], "bookend_end": [], "messages_before": 0, "messages_after": 2}, {"session_id": "20260607_070026_cdef0734", "when": "June 07, 2026 at 07:00 AM", "source": "webhook", "model": "glm-5.1", "title": null, "matched_role": "user", "match_message_id": 104140, "snippet": "...true`, they are replying to a >>>message<<< Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an >>>incoming<<< WhatsApp >>>message<<< is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification...", "bookend_start": [], "messages": [{"id": 104140, "role": "user", "content": "[IMPORTANT: The user has invoked the \"whatsapp-evolution\" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]\n\n---\nname: whatsapp-evolution\ndescription: WhatsApp messaging via Evolution API — send, receive, analyze, and auto-reply. Use when processing an incoming WhatsApp message delivered via the webhook, replying on Vincent's behalf, liaising with contacts, sending messages, or handling WhatsApp-related tasks. Covers intent analysis, auto-reply rules, escalation (VIP → phone call), conversation context fetching, and draft approvals via the jimmy-wa-drafts sidecar.\n---\n\n# WhatsApp via Evolution API\n\n## Credentials\nStored in `/opt/data/secrets/.whatsapp-evolution-credentials`:\n- `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE`\n\nLoad with: `export $(cat .whatsapp-evolution-credentials | xargs)`\n\n## Sending Messages (direct, no approval)\n\n```bash\nexport $(cat .whatsapp-evolution-credentials | xargs)\ncurl -s -X POST \"${EVOLUTION_API_URL}/message/sendText/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"number\": \"6591234567\", \"text\": \"Hello!\"}'\n```\n\n- Phone format: country code, no `+` (e.g. `6590054438`)\n- Groups: use JID `{groupid}@g.us`\n- For media/location/contacts — see [references/api.md](references/api.md)\n\nUse direct send only when Vincent has explicitly commanded/approved the exact send, or for pre-approved auto-replies/follow-ups. For any reply that needs Vincent's review, use the sidecar (next section).\n\n### Direct replies when Vincent names an existing WhatsApp thread\n\nIf Vincent says to “reply to X” or “confirm with X” and has clearly approved the action, but the contact JID/number is not in the current prompt:\n\n1. Search recent session history first (`session_search`) for the contact name, clinic/business name, or appointment details. Prior webhook summaries often contain the exact `remoteJid`, alternate phone JID, sender name, and context.\n2. If session history identifies the WhatsApp phone JID, send directly only if Vincent’s instruction is explicit enough to act. Keep the text concise and natural.\n3. If only a LID JID is found, look for `remoteJidAlt`/alternate phone JID in the session summary or fetch contacts/chats before sending.\n4. Do not invent phone numbers from web search when replying to an existing WhatsApp conversation. Use web search only as supporting context, not as the destination, unless Vincent explicitly asks to contact a public number.\n\n\n### Sending trip/camp details from retrieved emails\n\nWhen Vincent explicitly asks to send extracted school trip/camp details to a child or family member, send directly. Keep it practical and grounded:\n- Include dates, reporting/pick-up times, flights, venue/itinerary highlights, deadlines, forms, and safety/medication notes when found.\n- If an official packing list is missing, say that clearly. Do not invent one. You may add a short “based on itinerary/weather, prepare…” list, clearly labeled as practical guidance rather than official school instructions.\n- Avoid long explanations about the search process in WhatsApp. Send the useful details.\n\n### Sending a local image/file as media\n\nFor user-approved media sends, base64-encode the local file and call `sendMedia`:\n\n```bash\nexport $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\npython3 - <<'PY'\nimport base64, json\npath = '/tmp/file.png'\npayload = {\n 'number': '6591234567',\n 'mediatype': 'image',\n 'mimetype': 'image/png',\n 'caption': 'Caption here',\n 'media': base64.b64encode(open(path, 'rb').read()).decode(),\n 'fileName': 'file.png',\n}\nopen('/tmp/wa_media_payload.json', 'w').write(json.dumps(payload))\nPY\ncurl -s --max-time 60 -X POST \"${EVOLUTION_API_URL}/message/sendMedia/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_media_payload.json\n```\n\nWhen sending QR tickets or passes, verify the crop/image contains the full QR code and the intended attendee name before sending; the QR screenshot is effectively the redeemable ticket.\n\n## Fetching Conversation Context\n\nBefore drafting or proposing any reply, always fetch recent chat history for context:\n\n```bash\npython3 /opt/data/scripts/wa_recent_messages.py \"6591234567@s.whatsapp.net\" 30\n```\n\nUse `limit=30` by default. For active group chats or ambiguous situations, fetch up to 50 messages. Read enough context to understand who is being discussed, current tone, previous commitments, and whether Vincent has already replied. The script returns newest messages first (descending order).\n\n## Draft Approvals — `jimmy-wa-drafts` sidecar\n\nWhen you need Vincent to approve a WhatsApp reply, POST the draft to the sidecar. It owns the entire approval flow: Telegram message with 4 inline buttons, Telegram WebApp for Edit/Context, sending to WhatsApp via Evolution API on Approve.\n\n### Posting a draft\n\nAlways include an `extras` object with the original WA context. The sidecar passes `extras` through verbatim if Vincent later clicks \"Context\" — without it, your redraft will be flying blind.\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\n \"contact\": \"120363351787895353@g.us\",\n \"contact_name\": \"Cycling group\",\n \"summary\": \"Karen sent a blessing over the bike ride; she's not joining\",\n \"draft_text\": \"Thanks Karen, will miss you both — praying for a good time too 🙏\",\n \"extras\": {\n \"group_jid\": \"120363351787895353@g.us\",\n \"sender_name\": \"Karen Tan\",\n \"sender_jid\": \"6597630553@s.whatsapp.net\",\n \"category\": \"prayer/blessing\",\n \"original_message\": \"Happy cycling, to those who can make it… Dn Jeff and I had made plans for this PM so we won't be able to join y'all. Praying the Lord's favor, protection and good-weather blessings be upon y'all and your route!\",\n \"extracted\": {\n \"what\": \"Cycling from East Coast Park to Changi Village, then dinner\",\n \"when\": \"Today, 4:00 PM\",\n \"where\": \"East Coast Park to Changi Village\",\n \"who\": \"Group members who voted yes for cycling/dinner or dinner\"\n },\n \"confidence\": \"high\",\n \"why_appropriate\": \"Multiple group messages are blessings/wishes; Vincent has not replied; tone matches grace-based norm.\",\n \"wa_message_id\": \"ABC123XYZ\"\n }\n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\nResponse: `{\"draft_id\":\"abc123def\",\"telegram_message_id\":42}`\n\nWhat to put in `extras` (use whatever makes sense for the message type):\n\n- **Always**: `original_message` (verbatim WA text), `sender_name`, `category` (e.g. `prayer`, `birthday`, `event`, `dm-reply`, `enquiry`)\n- **Group messages**: `group_jid`, `sender_jid`\n- **DMs**: `sender_jid`\n- **Strongly recommended (drives the rich Telegram render)**: `extracted` (dict of who/what/when/where), `confidence` (`high`/`medium`/`low`), `why_appropriate` (one-line reasoning shown to Vincent so he can decide)\n- **Optional**: `wa_message_id`, `tone_hint` (e.g. `warm-grace-based`), `sender_phone` (auto-derived from `sender_jid` if omitted)\n\nVincent receives a Telegram message in the `@jimmy_WA_bot` chat with 4 inline buttons. The sidecar handles every interaction; **you do not track replies, callbacks, or message IDs**. Just POST and move on.\n\n| Button | Behavior |\n|--------|----------|\n| **Approve** | Sidecar sends draft to WhatsApp |\n| **Edit** | Opens a Telegram WebApp with draft pre-filled; Vincent edits + sends; sidecar pushes to WhatsApp |\n| **Context** | Opens a Telegram WebApp with empty textarea + the original WA message visible; Vincent provides context. (When `HERMES_REDRAFT_WEBHOOK` is wired in the sidecar's env, this will POST back to Jimmy for a redraft. Until then, the context is just stored on the draft.) |\n| **Pass** | Sidecar deletes the draft |\n\n### JSON escaping when `draft_text` contains quotes\n\nSingle quotes (`It's`, `God's`) break shell interpolation. Write the JSON to a temp file first:\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\"contact\":\"6591234567@s.whatsapp.net\",\"contact_name\":\"John\",\"summary\":\"…\",\"draft_text\":\"It's nice to hear from you!\"}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\n### Networking\n\n`https://jimmy-draft.huper.asia` is the public HTTPS endpoint of the sidecar (TLS-terminated by Traefik on ubuntu-server). HTTPS is required because Hermes blocks plain-HTTP commands via tirith. The internal Docker DNS hostname http://jimmy-wa-v64mqe:8080 also works for non-shell calls but the agent is forbidden from using it.\n\n### Sidecar status & logs\n\n- Source: (auto-deploys on push)\n- Public URL: (`/health` for ping)\n- Container: `jimmy-wa-v64mqe.1.` on ubuntu-server\n- Bot: `@jimmy_WA_bot`\n\n## Handling a Redraft Trigger (`wa-redraft` webhook)\n\nWhen Vincent clicks the **Context** button on a draft, the sidecar:\n1. Edits the original Telegram message to \"🔄 Adding context for X… agent is redrafting.\" (removes the old buttons so Vincent can't approve the now-stale text)\n2. POSTs to your `wa-redraft` webhook (`http://localhost:8644/webhooks/wa-redraft`) with the original draft fields plus Vincent's `context` instructions and the `extras` you stored when you first created the draft.\n\nWhen you receive a `wa-redraft` activation, the prompt template will lay out the fields. Your job:\n\n1. Read Vincent's `context` carefully — that's the steering signal. The `previous_draft` is just history, not necessarily a starting point.\n2. Re-fetch recent WA history if the original context isn't enough (`python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`). Use `extras.group_jid` for groups or `contact` for DMs.\n3. Write a NEW draft that addresses Vincent's instructions while staying faithful to the original WA context (in `extras.original_message`, `extras.extracted`).\n4. POST the new draft back to the sidecar **with the same `draft_id`** so the existing draft is updated, not duplicated:\n\n```bash\ncat > /tmp/wa_redraft.json <\",\n \"contact\": \"\",\n \"contact_name\": \"\",\n \"summary\": \"\",\n \"draft_text\": \"\",\n \"extras\": \n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_redraft.json\n```\n\nThe sidecar will send a NEW Telegram message with 4 fresh buttons. Vincent can Approve, Edit, Context-again (loop), or Pass.\n\nDo **not** send to WhatsApp directly during a redraft — the approval flow is the sidecar's job.\n\n## Incoming Message Processing\n\n### Delivery Pipeline (current)\n1. Evolution API POSTs each `MESSAGES_UPSERT` webhook directly to Hermes at `http://localhost:8644/webhooks/whatsapp-incoming` (proxied via `jimmy.huper.asia/webhooks/whatsapp-incoming`)\n2. Hermes webhook route (`whatsapp-incoming`, dynamic subscription) validates and spawns an agent\n3. Agent loads `whatsapp-evolution` skill, follows the rules below\n4. Webhook subscription uses `deliver: log` because webhooks do not support `deliver: local`; the agent explicitly calls `send_message(target=\"telegram\", ...)` only for important/actionable/VIP/event/birthday/prayer/error items, OR posts a draft to the sidecar for approval items.\n\n**No more OpenClaw, no Python webhook server, no port 5001, no keepalive cron.** If messages aren't being processed, check: nginx proxy to port 8644, `hermes webhook list` for the subscription, Evolution API webhook URL setting.\n\n## Contact Rules & Escalation\n\nConfig: `memory/whatsapp-vip-contacts.json`\n\n### Priority Levels\n\n| Priority | Action | Example |\n|----------|--------|---------|\n| **VIP (call-always)** | **CALL Vincent immediately via Twilio without asking.** Do NOT ask for approval first. During quiet hours (1am-10am SGT) suppress call and use Telegram notification only, unless life-and-death emergency. Match VIPs by both phone JID (`659...@s.whatsapp.net`) and LID JID (`...@lid`) when present, because recent chats may arrive with `key.remoteJid` as an LID and the phone number only in `key.remoteJidAlt`. | Dn Jeff (+6597630553) |\n| **Family** | Always notify Vincent on Telegram | ps112 group, family members |\n| **Auto-reply** | Reply based on context file, notify Vincent | Repair shops |\n| **Normal** | Analyze intent, reply if appropriate, notify if important | Everyone else |\n| **Broadcast/Spam** | Ignore silently | Forwarded job posts, chain messages |\n\n### Quiet Hours\n- 1am-10am SGT (17:00-02:00 UTC): avoid phone calls, including for Dn Jeff / VIP call-always contacts\n- **Exception:** life-and-death emergencies always call regardless of quiet hours\n- During quiet hours, use Telegram notification only and mention that a call was suppressed due to quiet hours\n\n## Making a Call via Twilio\n\nCredentials: `/opt/data/secrets/.twilio-credentials-new` (`TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`). From: `+16614809287`, Vincent: `+6590054438`.\n\nPreferred script: `/opt/data/scripts/call_vincent.py`, now using the Twilio Python SDK baked into the container.\n\n```bash\npython3 /opt/data/scripts/call_vincent.py \"Hello Vincent, this is Jimmy. Urgent summary here.\"\n```\n\n## Action Rules\n\n### Default: DO NOT auto-reply directly, and do not spam main Telegram\nCollect intent and stay local unless something truly needs Vincent's immediate attention.\n\n**When you need Vincent to approve a reply, POST a draft to the sidecar (see \"Draft Approvals\" above).** The sidecar handles the Telegram notification with buttons, the WebApp for editing, and sending to WhatsApp on approve. Do **not** duplicate that draft or summary in the main Telegram chat.\n\nMain Telegram WhatsApp summaries are muted by default. Only use `send_message(target=\"telegram\", ...)` for urgent/important/actionable items that need Vincent now: VIP actionable messages/calls, missed/ringing family calls, emergencies, safety/medical/school-pickup issues, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors. Routine logistics, casual family pings, generic group events, links, standalone codes, normal confirmations, and suggested replies should be local-only or sidecar-only.\n\nDo NOT send WhatsApp replies directly unless Vincent explicitly approves them or the category is pre-approved.\n\n### Tone (when proposing or sending replies)\n- Casual, direct, no sign-offs (no \"Thanks & Regards\", no \"- Vincent\")\n- Ask questions instead of giving instructions\n- Keep it natural — like texting a friend\n- Christian and grace-based: warm, humble, encouraging, non-judgmental, peace-making\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- For prayer/birthday contexts, use gentle faith-filled language; avoid sounding preachy, performative, or overly formal\n\n### What to Do With Each Message\n\n**DMs (personal chats):**\n- Analyze intent\n- Notify Vincent on Telegram (via `send_message(target=\"telegram\", ...)`) only if important/actionable/time-sensitive, requires a decision, comes from a VIP contact\n- If you have a useful suggested reply, POST it as a draft to the sidecar instead of a Telegram notification — Vincent gets the buttons inline\n- Casual/social/non-actionable DMs → local log only, no Telegram notification, no draft\n- Do NOT reply unless Vincent says to\n- VIP contacts → **call Vincent immediately without asking**. Do NOT post a draft. Just call. During quiet hours (1am-10am SGT), suppress the call and send Telegram notification only, unless life-and-death emergency.\n\n**Group chats — events/meetups:**\n- If someone posts about an event, gathering, meetup, dinner, etc:\n 1. Extract: what, when, where, who\n 2. Notify Vincent on Telegram with details (no draft needed unless you also want to suggest an RSVP reply)\n 3. Ask: \"Want me to add this to Google Calendar and set a reminder?\"\n\n**Group chats — prayer requests / praying for others:**\n- If someone asks for prayer, shares a burden, illness, grief, struggle, thanksgiving, or asks the group to pray:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Identify who/what to pray for and any sensitive details\n 3. Draft a short, grace-based Christian reply/prayer\n 4. POST it to the sidecar with `summary` describing the request and `draft_text` being the prayer\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — birthday greetings:**\n- If you see birthday wishes for someone in a group:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Note who the birthday person is and relationship/context if inferable\n 3. Draft a warm, grace-based Christian birthday greeting\n 4. POST it to the sidecar\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — general:**\n- Casual banter, memes, forwards, broadcast spam → ignore silently\n- Religious content shares → ignore unless specifically addressed to Vincent\n\n### When to Notify Vincent (Telegram) vs Post a Draft\n\n**Use `send_message(target=\"telegram\", ...)` sparingly** for plain notifications only when the message needs Vincent's immediate attention: VIP actionable DMs/calls, emergencies, school pickup/child logistics requiring action now, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors.\n\n**Use the sidecar (POST `/drafts`)** whenever you have a suggested reply for Vincent to approve: prayer responses, birthday greetings, DM replies, group RSVPs. Do not also send the same summary to the main Telegram chat.\n\nDo **not** notify or draft for casual/social/non-actionable DMs, routine logistics, read receipts/status updates, religious shares not addressed to Vincent, memes, forwarded/broadcast/spam, generic group events, standalone codes, normal confirmations, or general group chatter. Family messages are not automatically main-chat alerts unless urgent, safety-related, school/pickup critical, or explicitly requiring Vincent now.\n\n## Intent Categories\n\nWhen analyzing incoming messages, classify as:\n\n1. **Urgent** — time-sensitive, emergencies → call Vincent outside quiet hours; during quiet hours use Telegram only unless life-and-death\n2. **Important** — decisions, money, appointments, anything consequential → call Vincent outside quiet hours; during quiet hours notify on Telegram only unless life-threatening\n3. **Request** — asking Vincent to do something → **notify Vincent** (Telegram) or post a draft if you have a safe reply\n4. **Enquiry** — asking for info, pricing, availability → auto-reply if context available, otherwise post a draft\n5. **Confirmation** — confirming plans, bookings → acknowledge and notify Vincent\n6. **Casual/Social** — greetings, memes, forwards → usually ignore or react\n7. **Broadcast/Spam** — forwarded job posts, chain messages → ignore\n\n## Finding Individual Contacts and JIDs\n\nWhen Vincent asks to monitor or prioritize a specific WhatsApp contact and only gives a name:\n\n1. Query contacts and chats, then search both display names and serialized message metadata:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n python3 - <<'PY'\n import os, requests, json\n base=os.environ['EVOLUTION_API_URL']; inst=os.environ['EVOLUTION_INSTANCE']; key=os.environ['EVOLUTION_API_KEY']\n headers={'apikey':key,'Content-Type':'application/json'}\n terms=['whye','keong'] # replace with lowercase search terms\n for ep in ['findContacts','findChats']:\n r=requests.post(f'{base}/chat/{ep}/{inst}', headers=headers, json={}, timeout=30)\n print('endpoint', ep, 'status', r.status_code)\n for obj in (r.json() if r.ok else []):\n if any(t in json.dumps(obj, ensure_ascii=False).lower() for t in terms):\n print(json.dumps(obj, ensure_ascii=False)[:1000])\n PY\n ```\n2. If `findChats` shows `remoteJid` ending in `@lid`, inspect `lastMessage.key.remoteJidAlt` for the phone JID. Store both in `memory/whatsapp-vip-contacts.json` when configuring VIP/call rules.\n3. Verify the number exists with `/chat/whatsappNumbers/{instance}` before finalizing.\n4. Validate the JSON config after editing: `python3 -m json.tool /opt/data/agent-state/memory/whatsapp-vip-contacts.json >/tmp/wa_vip_validated.json`.\n\n## Finding Group JIDs\n\nThe `group/fetchGroups` endpoint returns 404 on some Evolution API instances (including ours). To find a group JID when you don't have it:\n\n1. **Use `chat/findMessages` with an empty `where` filter** to pull recent messages across all chats, then extract unique `remoteJid` values containing `@g.us`.\n2. **Match by content**: Fetch messages from each group JID using `where.key.remoteJid` and match sender names or message text to identify the target group.\n3. **Known group JIDs** (store in `whatsapp-vip-contacts.json` as you discover them):\n - ps112 family group: `6591887488-1464061170@g.us`\n - Overcomers group: `6591444911-1614821287@g.us`\n - 神的华美 group: `6596756578-1405873840@g.us` (content-sharing, casual/broadcast, local log only)\n - Overcomers/TCians group: `120363230236878544@g.us` (church fellowship, PJ sends weekly attendance reports, routine attendance = local log only)\n\nExample search:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"6591444911-1614821287@g.us\"}}, \"limit\": 10, \"order\": \"desc\"}'\n```\n\n## Processing Image/Media Messages\n\nWhen an incoming WhatsApp message is an image (imageMessage), video, or other media with no text or only a caption:\n\n1. **Fetch conversation context** as usual with `wa_recent_messages.py`\n2. **Download the media** using the Evolution API:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"message\": {\"key\": {\"id\": \"MESSAGE_ID\", \"fromMe\": false, \"remoteJid\": \"GROUP_OR_CHAT_JID\"}}, \"convertToMp4\": false}' \\\n -o /tmp/wa_media_response.json\n ```\n Then decode: `python3 -c \"import json,base64; d=json.load(open('/tmp/wa_media_response.json')); open('/tmp/wa_media.jpg','wb').write(base64.b64decode(d['base64']))\"`\n3. **Analyze the image** using `vision_analyze` tool with `/tmp/wa_media.jpg` path and a context-aware question\n4. **Fallback if vision is unavailable**: Use `tesseract /tmp/wa_media.jpg stdout` for OCR, combined with pixel analysis (PIL/numpy) and conversation context to infer what the image contains\n5. **Album messages** (`albumMessage`): These bundle multiple images. Each image arrives as a separate `imageMessage` with a `messageAssociation.parentMessageKey` linking them to the album. Download each image separately. The `expectedImageCount` field tells how many images to expect\n6. **messageAssociation**: When images have `parentMessageKey` with `fromMe: true`, they are replying to a message Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an incoming WhatsApp message is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification when possible:\n\n1. **Fetch conversation context first**, especially in groups:\n ```bash\n python3 /opt/data/scripts/wa_recent_messages.py \"\" 50\n ```\n2. **Download the audio** with Evolution API, saving the response to a file:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_audio_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"GROUP_OR_CHAT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_audio_req.json -o /tmp/wa_audio_response.json\n python3 - <<'PY'\n import json, base64\n d=json.load(open('/tmp/wa_audio_response.json'))\n b=d.get('base64') or d.get('data',{}).get('base64')\n if ',' in b[:100]: b=b.split(',',1)[1]\n open('/tmp/wa_audio.ogg','wb').write(base64.b64decode(b))\n PY\n ```\n3. **Convert OGG/Opus to WAV**:\n ```bash\n ffmpeg -y -i /tmp/wa_audio.ogg -ar 16000 -ac 1 /tmp/wa_audio.wav\n ```\n4. **Transcribe fallback when Whisper is unavailable**: the container may not have `whisper`, `faster-whisper`, `torch`, or `speech_recognition` installed globally. A reliable lightweight fallback is a temp venv with SpeechRecognition and Google web speech:\n ```bash\n python3 -m venv /tmp/srvenv\n /tmp/srvenv/bin/pip install --quiet SpeechRecognition\n /tmp/srvenv/bin/python - <<'PY'\n import speech_recognition as sr\n r=sr.Recognizer()\n with sr.AudioFile('/tmp/wa_audio.wav') as source:\n audio=r.record(source)\n print(r.recognize_google(audio, language='en-SG'))\n PY\n ```\n5. **Check adjacent voice notes from the same sender**. WhatsApp voice notes are often sent in quick sequential chunks. Download and transcribe nearby `audioMessage` records from the same sender before deciding whether to notify Vincent.\n6. Use the transcript plus recent context to classify the message. For group meetup planning, extract what/when/where/who and notify Vincent if actionable, including transcript confidence.\n\n### Video Messages\n\nWhen an incoming WhatsApp message is `videoMessage` (with or without a caption):\n\n1. **Fetch conversation context** first, as with any media message: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n2. **Download the video** using Evolution API (same pattern as images/audio — save response to file, then decode):\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_video_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"CONTACT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_video_req.json -o /tmp/wa_video_response.json\n python3 - <<'PY'\n import json, base64\n d = json.load(open('/tmp/wa_video_response.json'))\n b = d.get('base64') or d.get('data', {}).get('base64', '')\n if ',' in b[:100]: b = b.split(',', 1)[1]\n open('/tmp/wa_video.mp4', 'wb').write(base64.b64decode(b))\n print(f\"Video saved: {len(base64.b64decode(b))} bytes\")\n PY\n ```\n3. **Extract representative frames** with ffmpeg (do NOT try to analyze video directly — vision tools work on images):\n ```bash\n # Frame at start\n ffmpeg -y -ss 0 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame0.jpg\n # Frame at ~1/3 of the way (good for short videos)\n ffmpeg -y -ss 3 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame3s.jpg\n # If video is longer (>15s), also extract at midpoint\n # ffprobe -v quiet -print_format json -show_format /tmp/wa_video.mp4 # get duration\n # ffmpeg -y -ss $HALF_DURATION -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame_mid.jpg\n ```\n4. **Analyze the frames** with `vision_analyze`. Ask specific questions about content: what objects/people/text are visible, what type of video this appears to be (screen recording, personal clip, forwarded meme, etc.), and any text or identifying details.\n5. **Use the caption** from `videoMessage.caption` if present (can be empty/null). Combine frame analysis + caption + conversation context to classify the message intent.\n6. **Transcription of audio in video** is generally not feasible with available tools. If the video likely contains important spoken content (e.g. voice memo style), note this limitation and flag it to Vincent.\n\n**Key points:**\n- Always use `-o` flag and file-based decode; videos can be 1-10MB and base64 responses will be even larger\n- `--max-time 25` on curl is essential for video downloads\n- Short casual clips (showing a key fob, a parking spot, quick view of surroundings) are usually casual/social — classify accordingly and don't over-escalate\n- If `vision_analyze` fails (400 on models that don't support images, or 429 rate limit), fall back to conversation context inference alone\n\n### Poll Messages\nWhen `pollUpdateMessage` appears in chat history (someone voted), the message contains a `pollCreationMessageKey` with the original poll's message `id`. To get the poll question and options, query:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"id\": \"POLL_MESSAGE_ID\", \"remoteJid\": \"GROUP_JID\"}}, \"limit\": 1}'\n```\nThe response will have `pollCreationMessageV3` with `name` (the poll question) and `options` array. Always look up the poll when votes or poll references appear, since the poll text contains the event details.\n\n### `associatedChildMessage` (Motion Photos / Companion Media)\nWhatsApp motion photos and companion media arrive as `associatedChildMessage` type. These are short videos or supplementary images attached to a parent message via `messageAssociation.parentMessageKey`. The structure is:\n\n```\nmessage: {\n messageContextInfo: {\n messageAssociation: {\n associationType: 12,\n parentMessageKey: { id, fromMe, remoteJid }\n }\n },\n associatedChildMessage: {\n message: {\n videoMessage: { ... } // or imageMessage: { ... }\n }\n }\n}\n```\n\nKey points:\n- The `parentMessageKey.id` links back to the main message (the image/video they accompany)\n- When `parentMessageKey.fromMe: true`, the companion media is replying to a message Vincent sent\n- Captions may appear on the child's `videoMessage.caption` or `imageMessage.caption` inside the nested message, NOT at the top level\n- These often contain motion-photo short videos (3 seconds) or album companion images\n- Treat them as media attachments to the parent; don't process them as standalone messages requiring independent intent analysis\n\n### Sequential Image Sets (Non-Album)\nMultiple images sent rapidly by the same sender at similar timestamps may NOT be wrapped in an `albumMessage`. They appear as separate `imageMessage` entries. Always check for adjacent images from the same sender and download+OCR all of them together, as they often form a coherent multi-page set (e.g., route info, comparison tables, maps). Missing one can lose important context.\n\n### Key pitfalls\n- The Evolution API `getBase64FromMediaMessage` endpoint returns the full base64 payload (can be 100KB+ for images). Use `-o` flag to save to file, then parse with Python. Do NOT try to pipe it directly\n- `curl` with `--max-time 25` is essential; without it, large media responses can hang\n- The `albumMessage` type itself has no `fromMe` field; check each individual `imageMessage` instead\n- Image thumbnails in the webhook payload are truncated; always download the full image for analysis\n- **WhatsApp media URLs expire**: CDN download links (`mmg.whatsapp.net`) expire after roughly 24-48 hours. When `getBase64FromMediaMessage` returns a 400/404 error like \"Failed to fetch stream from https://mmg.whatsapp.net/...\", the media is no longer retrievable via the API. This means you CANNOT download adjacent/older voice notes, images, or documents for context if they're more than a day or two old. Fall back to conversation text context only. Fresh/incoming messages are almost always still downloadable.\n- **Voice note transcription reliability**: Short voice notes (<15s), especially in Singlish/Chinese-English mix common in Singapore contacts, often produce unreliable transcriptions via SpeechRecognition+Google. Always flag low-confidence transcriptions to Vincent rather than acting on potentially garbled text.\n- **Vision API failures**: `vision_analyze` can fail with 429 (rate limit) **or 400** (model doesn't support image input, e.g. glm-5.1 via ollama-cloud). Always have a fallback plan. When vision is unavailable, try tesseract OCR, then pixel analysis (PIL/numpy for brightness zones, text region detection via row-level dark-pixel density)\n- **OCR can fail on decorative/complex images**: Obituaries, memorial notices, religious flyers with ornate fonts, or dark-background images often defeat tesseract even with aggressive preprocessing (upscaling, thresholding, inverting, PSM mode variations). When OCR returns gibberish or empty, fall back to conversation context inference: use preceding/surrounding messages to determine what the image likely contains, then draft an appropriate response based on that context\n- **OCR on devotional/dark-background images**: These often defeat tesseract on full-image scans. Use a multi-strategy approach: (1) run tesseract on the full image first, (2) if text is sparse, crop into regions (top 15%, middle, bottom 15%), upscale 2-3x, enhance contrast (2-3x), and try both normal and inverted (`ImageOps.invert`) versions with `--psm 6`, (3) combine all partial OCR results. For Exodus/verse images, the title (e.g. \"WHEN MOSES' HANDS WERE\") is often in decorative font that OCR misses, while the verse body text is usually readable.\n- **Prioritize conversation context**: When processing group images, always fetch recent context FIRST (before attempting vision/OCR). If image analysis fails, the conversation history often makes the image's purpose clear enough to draft a meaningful reply\n- **DM conversation context is one-sided (fromMe only)**: The Evolution API `findMessages` endpoint often returns only Vincent's sent messages (`fromMe: true`) for DM chats, not the contact's incoming messages. This means `wa_recent_messages.py` can show only one side of the conversation. When processing a DM webhook, do NOT rely solely on fetched history for the contact's side. The webhook payload already contains the incoming message text. Use what the webhook gives you plus Vincent's side from the API. If you need the contact's recent messages, check if `fromMe: false` records exist; if none are returned, the API simply doesn't have them and you must work with what you have (webhook message + Vincent's replies for context).\n- **`findMessages` ignores `key.fromMe` filter**: Passing `\"key.fromMe\": false` in the `where` clause still returns all messages (including `fromMe: true`). To find messages from a specific sender, fetch results and filter client-side by checking `fromMe` in Python.\n- **`findMessages` caps at ~50 records per request**: Even with `limit: 200` or `limit: 500`, the API returns at most 50 records. When one party sends many consecutive messages, the other party's messages may not appear in the first 50 results. Use pagination or fetch multiple pages if you need older messages.\n- **Video frame extraction**: Use `ffmpeg -y -ss N -i video.mp4 -frames:v 1 -update 1 output.jpg` (the `-update 1` flag is needed; without it ffmpeg treats the output as an image sequence and fails). Extract frames at multiple timestamps (start, ~1/3, midpoint for longer videos) since a single frame may miss the key content.\n- **ffmpeg \"Invalid color space\" failure on WhatsApp videos**: Some WhatsApp H.264 videos have `color_space=reserved` and `color_primaries=reserved` metadata that ffmpeg cannot decode at all. Every codec/format/filter combination fails with \"Invalid color space\" / return code 234. This is NOT fixable with `-vf setparams`, `-color_trc`, `-colorspace`, `-color_primaries`, or `-vf format=` flags because the error occurs at the decoder level before any filter runs. **Do not waste time trying ffmpeg workarounds.** When this happens, fall back to: (1) conversation context inference (who sent it, what thread it's in, adjacent messages), (2) the jpegThumbnail embedded in the videoMessage (small but gives a rough idea; base64-encoded JPEG, typically 600-900 bytes, use `vision_analyze` on it if available), or (3) if all analysis fails, just classify based on surrounding text context.\n- **`wa_recent_messages.py` JSON output breaks `json.loads()`**: The script's stdout contains WhatsApp message content with control characters (emoji, CJK, deviceListMetadata binary blobs) that cause `json.loads()` to fail with `Invalid control character` or `Expecting ',' delimiter` errors. Even `strict=False` does not reliably fix it. **Reliable approach**: use `curl -o /tmp/wa_msgs.json` to save the Evolution API response directly to a file (the file write handles binary safely), then parse with `json.load(open('/tmp/wa_msgs.json'))` in Python. Example:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 15 -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"CONTACT_JID\"}}, \"limit\": 20}' -o /tmp/wa_msgs.json\n ```\n Then in Python: `data = json.load(open('/tmp/wa_msgs.json')); records = data['messages']['records']`\n\n## API Reference\n\nFor full API endpoints (media, contacts, location, buttons) — see [references/api.md](references/api.md).\n\n[Skill directory: /opt/data/skills/personal/whatsapp-evolution]\nResolve any relative paths in this skill (e.g. `scripts/foo.js`, `templates/config.yaml`) against that directory, then run them with the terminal tool using the absolute path.\n\n[This skill has supporting files:]\n- references/api.md -> /opt/data/skills/personal/whatsapp-evolution/references/api.md\n- scripts/process_queue.py -> /opt/data/skills/personal/whatsapp-evolution/scripts/process_queue.py\n- scripts/send_text.sh -> /opt/data/skills/personal/whatsapp-evolution/scripts/send_text.sh\n\nLoad any of these with skill_view(name=\"personal/whatsapp-evolution\", file_path=\"\"), or run scripts directly by absolute path (e.g. `node /opt/data/skills/personal/whatsapp-evolution/scripts/foo.js`).\n\nThe user has provided the following instruction alongside the skill invocation: Incoming WhatsApp webhook received.\n\nEvent: messages.upsert\nInstance: n8n-WA\nSender: Joelle\nSender JID: 120363418226875471@g.us\nFrom me: False\nMessage type: reactionMessage\nMessage: {\n \"event\": \"messages.upsert\",\n \"instance\": \"n8n-WA\",\n \"data\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"3A6C4D378FD5B1F81DBD\",\n \"participant\": \"192260067090531@lid\",\n \"participantAlt\": \"6594519406@s.whatsapp.net\",\n \"addressingMode\": \"lid\"\n },\n \"pushName\": \"Joelle\",\n \"status\": \"DELIVERY_ACK\",\n \"message\": {\n \"reactionMessage\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"ACAF388AF06A2F0E1C1777A7917ED565\",\n \"participant\": \"207726193995780@lid\"\n },\n \"text\": \"\\u2764\\ufe0f\",\n \"senderTimestampMs\": {\n \"low\": -1595802564,\n \"high\": 414,\n \"unsigned\": false\n }\n }\n },\n \"messageType\": \"reactionMessage\",\n \"messageTimestamp\": 1780815625,\n \"instanceId\": \"b2bfaa30-d5f1-4b22-b979-791892619414\",\n \"source\": \"ios\"\n },\n \"destination\": \"https://jimmy.huper.asia/webhooks/whatsapp-incoming\",\n \"date_time\": \"2026-06-07T04:00:26.415Z\",\n \"sender\": \"6590054438@s.whatsapp.net\",\n \"server_url\": \"https://evolution.huper.asia\",\n \"apikey\": \"5E4C532223A0-460B-A749-16C584F56E74\"\n}\n\nPrimary goal: REDUCE NOISE. Stay silent in the main Telegram chat unless the message truly needs Vincent's immediate attention. Approval-worthy reply drafts should go to the jimmy-wa-drafts sidecar, NOT to the main Telegram chat.\n\nFollow the whatsapp-evolution skill and /opt/data/agent-state/memory/whatsapp-vip-contacts.json, with these stricter notification rules:\n\nABSOLUTE SILENCE RULES:\n- If fromMe is true: ignore. Do NOT call send_message. Final response local log only.\n- Ignore status/update/read-receipt events or payloads with no real message content.\n- Do NOT send main Telegram summaries for ordinary DMs, routine logistics, casual family messages, generic group events, group chatter, forwards, memes, devotional/religious shares, links, standalone codes, normal confirmations, or suggested replies.\n- Do NOT ask \"Want me to add this to Calendar?\" for WhatsApp group events unless Vincent is explicitly invited, named, responsible, or the event is time-sensitive and clearly relevant to him today.\n- Do NOT surface family messages just because they are family. Only surface if urgent, safety-related, school/pickup critical, money/legal/medical, or explicitly requires Vincent now.\n\nSIDEcar RULES:\n- If a reply may be useful but it is not urgent, POST it to https://jimmy-draft.huper.asia/drafts using the whatsapp-evolution Draft Approvals flow. Do NOT duplicate that draft/summary in the main Telegram chat.\n- Prayer and birthday replies should go to the sidecar when appropriate, not the main Telegram chat, unless the underlying situation is urgent/sensitive and Vincent needs immediate awareness.\n- Before drafting any reply or approval suggestion, fetch recent context: python3 /opt/data/scripts/wa_recent_messages.py \"120363418226875471@g.us\" 30. For active/ambiguous groups use 50.\n\nMAIN TELEGRAM ALERTS ARE ALLOWED ONLY FOR:\n1. VIP contacts marked always_call or always_notify when the message is important/actionable, or any VIP call event.\n2. Calls from family/VIP that appear missed/ringing and likely need immediate attention. Avoid repeated alerts for accepted/terminated duplicate call state updates.\n3. Emergencies, safety issues, medical issues, school pickup/child logistics requiring action now, legal/money/account/security matters, or time-sensitive commitments in the next few hours.\n4. Messages explicitly asking Vincent to call back, decide now, approve payment, provide OTP/code for an active task, or respond urgently.\n5. Pipeline errors or webhook processing failures.\n\nCALL RULES:\n- VIP contacts (e.g. Dn Jeff, Ng Whye Keong): call Vincent immediately outside quiet hours when actionably important or on ringing/missed call. During 1am-10am SGT, Telegram only unless life-and-death emergency.\n- Do not call for casual VIP confirmations or non-actionable courtesy messages.\n\nTONE/RULES FOR DRAFTS:\n- Proposed replies must be Christian and grace-based where appropriate: warm, humble, encouraging, non-judgmental, peace-making, not preachy.\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- NEVER auto-reply directly on WhatsApp unless Vincent explicitly approves or future config marks that exact category pre-approved.\n\nDelivery rule:\n- For allowed main alerts only: call send_message(target=\"telegram\", message=...). Keep it concise.\n- For non-urgent reply suggestions: use the sidecar only.\n- For everything else: DO NOT call send_message. Keep final response as local-only log.\n", "timestamp": 1780815635.6791086, "anchor": true}, {"id": 104141, "role": "assistant", "content": "This is a reaction message (❤️) from Joelle in a group chat. It contains no actionable content, just an emoji reaction to another message. Ignoring silently per noise reduction rules.\n\n**Local log:** Joelle reacted ❤️ in group `120363418226875471@g.us` at 2026-06-07T04:00:26Z. No action needed.", "timestamp": 1780815635.7016711}], "bookend_end": [], "messages_before": 0, "messages_after": 2}, {"session_id": "20260607_063521_c45bd6ec", "when": "June 07, 2026 at 06:35 AM", "source": "webhook", "model": "glm-5.1", "title": null, "matched_role": "user", "match_message_id": 104137, "snippet": "...true`, they are replying to a >>>message<<< Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an >>>incoming<<< WhatsApp >>>message<<< is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification...", "bookend_start": [], "messages": [{"id": 104137, "role": "user", "content": "[IMPORTANT: The user has invoked the \"whatsapp-evolution\" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]\n\n---\nname: whatsapp-evolution\ndescription: WhatsApp messaging via Evolution API — send, receive, analyze, and auto-reply. Use when processing an incoming WhatsApp message delivered via the webhook, replying on Vincent's behalf, liaising with contacts, sending messages, or handling WhatsApp-related tasks. Covers intent analysis, auto-reply rules, escalation (VIP → phone call), conversation context fetching, and draft approvals via the jimmy-wa-drafts sidecar.\n---\n\n# WhatsApp via Evolution API\n\n## Credentials\nStored in `/opt/data/secrets/.whatsapp-evolution-credentials`:\n- `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE`\n\nLoad with: `export $(cat .whatsapp-evolution-credentials | xargs)`\n\n## Sending Messages (direct, no approval)\n\n```bash\nexport $(cat .whatsapp-evolution-credentials | xargs)\ncurl -s -X POST \"${EVOLUTION_API_URL}/message/sendText/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"number\": \"6591234567\", \"text\": \"Hello!\"}'\n```\n\n- Phone format: country code, no `+` (e.g. `6590054438`)\n- Groups: use JID `{groupid}@g.us`\n- For media/location/contacts — see [references/api.md](references/api.md)\n\nUse direct send only when Vincent has explicitly commanded/approved the exact send, or for pre-approved auto-replies/follow-ups. For any reply that needs Vincent's review, use the sidecar (next section).\n\n### Direct replies when Vincent names an existing WhatsApp thread\n\nIf Vincent says to “reply to X” or “confirm with X” and has clearly approved the action, but the contact JID/number is not in the current prompt:\n\n1. Search recent session history first (`session_search`) for the contact name, clinic/business name, or appointment details. Prior webhook summaries often contain the exact `remoteJid`, alternate phone JID, sender name, and context.\n2. If session history identifies the WhatsApp phone JID, send directly only if Vincent’s instruction is explicit enough to act. Keep the text concise and natural.\n3. If only a LID JID is found, look for `remoteJidAlt`/alternate phone JID in the session summary or fetch contacts/chats before sending.\n4. Do not invent phone numbers from web search when replying to an existing WhatsApp conversation. Use web search only as supporting context, not as the destination, unless Vincent explicitly asks to contact a public number.\n\n\n### Sending trip/camp details from retrieved emails\n\nWhen Vincent explicitly asks to send extracted school trip/camp details to a child or family member, send directly. Keep it practical and grounded:\n- Include dates, reporting/pick-up times, flights, venue/itinerary highlights, deadlines, forms, and safety/medication notes when found.\n- If an official packing list is missing, say that clearly. Do not invent one. You may add a short “based on itinerary/weather, prepare…” list, clearly labeled as practical guidance rather than official school instructions.\n- Avoid long explanations about the search process in WhatsApp. Send the useful details.\n\n### Sending a local image/file as media\n\nFor user-approved media sends, base64-encode the local file and call `sendMedia`:\n\n```bash\nexport $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\npython3 - <<'PY'\nimport base64, json\npath = '/tmp/file.png'\npayload = {\n 'number': '6591234567',\n 'mediatype': 'image',\n 'mimetype': 'image/png',\n 'caption': 'Caption here',\n 'media': base64.b64encode(open(path, 'rb').read()).decode(),\n 'fileName': 'file.png',\n}\nopen('/tmp/wa_media_payload.json', 'w').write(json.dumps(payload))\nPY\ncurl -s --max-time 60 -X POST \"${EVOLUTION_API_URL}/message/sendMedia/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_media_payload.json\n```\n\nWhen sending QR tickets or passes, verify the crop/image contains the full QR code and the intended attendee name before sending; the QR screenshot is effectively the redeemable ticket.\n\n## Fetching Conversation Context\n\nBefore drafting or proposing any reply, always fetch recent chat history for context:\n\n```bash\npython3 /opt/data/scripts/wa_recent_messages.py \"6591234567@s.whatsapp.net\" 30\n```\n\nUse `limit=30` by default. For active group chats or ambiguous situations, fetch up to 50 messages. Read enough context to understand who is being discussed, current tone, previous commitments, and whether Vincent has already replied. The script returns newest messages first (descending order).\n\n## Draft Approvals — `jimmy-wa-drafts` sidecar\n\nWhen you need Vincent to approve a WhatsApp reply, POST the draft to the sidecar. It owns the entire approval flow: Telegram message with 4 inline buttons, Telegram WebApp for Edit/Context, sending to WhatsApp via Evolution API on Approve.\n\n### Posting a draft\n\nAlways include an `extras` object with the original WA context. The sidecar passes `extras` through verbatim if Vincent later clicks \"Context\" — without it, your redraft will be flying blind.\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\n \"contact\": \"120363351787895353@g.us\",\n \"contact_name\": \"Cycling group\",\n \"summary\": \"Karen sent a blessing over the bike ride; she's not joining\",\n \"draft_text\": \"Thanks Karen, will miss you both — praying for a good time too 🙏\",\n \"extras\": {\n \"group_jid\": \"120363351787895353@g.us\",\n \"sender_name\": \"Karen Tan\",\n \"sender_jid\": \"6597630553@s.whatsapp.net\",\n \"category\": \"prayer/blessing\",\n \"original_message\": \"Happy cycling, to those who can make it… Dn Jeff and I had made plans for this PM so we won't be able to join y'all. Praying the Lord's favor, protection and good-weather blessings be upon y'all and your route!\",\n \"extracted\": {\n \"what\": \"Cycling from East Coast Park to Changi Village, then dinner\",\n \"when\": \"Today, 4:00 PM\",\n \"where\": \"East Coast Park to Changi Village\",\n \"who\": \"Group members who voted yes for cycling/dinner or dinner\"\n },\n \"confidence\": \"high\",\n \"why_appropriate\": \"Multiple group messages are blessings/wishes; Vincent has not replied; tone matches grace-based norm.\",\n \"wa_message_id\": \"ABC123XYZ\"\n }\n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\nResponse: `{\"draft_id\":\"abc123def\",\"telegram_message_id\":42}`\n\nWhat to put in `extras` (use whatever makes sense for the message type):\n\n- **Always**: `original_message` (verbatim WA text), `sender_name`, `category` (e.g. `prayer`, `birthday`, `event`, `dm-reply`, `enquiry`)\n- **Group messages**: `group_jid`, `sender_jid`\n- **DMs**: `sender_jid`\n- **Strongly recommended (drives the rich Telegram render)**: `extracted` (dict of who/what/when/where), `confidence` (`high`/`medium`/`low`), `why_appropriate` (one-line reasoning shown to Vincent so he can decide)\n- **Optional**: `wa_message_id`, `tone_hint` (e.g. `warm-grace-based`), `sender_phone` (auto-derived from `sender_jid` if omitted)\n\nVincent receives a Telegram message in the `@jimmy_WA_bot` chat with 4 inline buttons. The sidecar handles every interaction; **you do not track replies, callbacks, or message IDs**. Just POST and move on.\n\n| Button | Behavior |\n|--------|----------|\n| **Approve** | Sidecar sends draft to WhatsApp |\n| **Edit** | Opens a Telegram WebApp with draft pre-filled; Vincent edits + sends; sidecar pushes to WhatsApp |\n| **Context** | Opens a Telegram WebApp with empty textarea + the original WA message visible; Vincent provides context. (When `HERMES_REDRAFT_WEBHOOK` is wired in the sidecar's env, this will POST back to Jimmy for a redraft. Until then, the context is just stored on the draft.) |\n| **Pass** | Sidecar deletes the draft |\n\n### JSON escaping when `draft_text` contains quotes\n\nSingle quotes (`It's`, `God's`) break shell interpolation. Write the JSON to a temp file first:\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\"contact\":\"6591234567@s.whatsapp.net\",\"contact_name\":\"John\",\"summary\":\"…\",\"draft_text\":\"It's nice to hear from you!\"}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\n### Networking\n\n`https://jimmy-draft.huper.asia` is the public HTTPS endpoint of the sidecar (TLS-terminated by Traefik on ubuntu-server). HTTPS is required because Hermes blocks plain-HTTP commands via tirith. The internal Docker DNS hostname http://jimmy-wa-v64mqe:8080 also works for non-shell calls but the agent is forbidden from using it.\n\n### Sidecar status & logs\n\n- Source: (auto-deploys on push)\n- Public URL: (`/health` for ping)\n- Container: `jimmy-wa-v64mqe.1.` on ubuntu-server\n- Bot: `@jimmy_WA_bot`\n\n## Handling a Redraft Trigger (`wa-redraft` webhook)\n\nWhen Vincent clicks the **Context** button on a draft, the sidecar:\n1. Edits the original Telegram message to \"🔄 Adding context for X… agent is redrafting.\" (removes the old buttons so Vincent can't approve the now-stale text)\n2. POSTs to your `wa-redraft` webhook (`http://localhost:8644/webhooks/wa-redraft`) with the original draft fields plus Vincent's `context` instructions and the `extras` you stored when you first created the draft.\n\nWhen you receive a `wa-redraft` activation, the prompt template will lay out the fields. Your job:\n\n1. Read Vincent's `context` carefully — that's the steering signal. The `previous_draft` is just history, not necessarily a starting point.\n2. Re-fetch recent WA history if the original context isn't enough (`python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`). Use `extras.group_jid` for groups or `contact` for DMs.\n3. Write a NEW draft that addresses Vincent's instructions while staying faithful to the original WA context (in `extras.original_message`, `extras.extracted`).\n4. POST the new draft back to the sidecar **with the same `draft_id`** so the existing draft is updated, not duplicated:\n\n```bash\ncat > /tmp/wa_redraft.json <\",\n \"contact\": \"\",\n \"contact_name\": \"\",\n \"summary\": \"\",\n \"draft_text\": \"\",\n \"extras\": \n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_redraft.json\n```\n\nThe sidecar will send a NEW Telegram message with 4 fresh buttons. Vincent can Approve, Edit, Context-again (loop), or Pass.\n\nDo **not** send to WhatsApp directly during a redraft — the approval flow is the sidecar's job.\n\n## Incoming Message Processing\n\n### Delivery Pipeline (current)\n1. Evolution API POSTs each `MESSAGES_UPSERT` webhook directly to Hermes at `http://localhost:8644/webhooks/whatsapp-incoming` (proxied via `jimmy.huper.asia/webhooks/whatsapp-incoming`)\n2. Hermes webhook route (`whatsapp-incoming`, dynamic subscription) validates and spawns an agent\n3. Agent loads `whatsapp-evolution` skill, follows the rules below\n4. Webhook subscription uses `deliver: log` because webhooks do not support `deliver: local`; the agent explicitly calls `send_message(target=\"telegram\", ...)` only for important/actionable/VIP/event/birthday/prayer/error items, OR posts a draft to the sidecar for approval items.\n\n**No more OpenClaw, no Python webhook server, no port 5001, no keepalive cron.** If messages aren't being processed, check: nginx proxy to port 8644, `hermes webhook list` for the subscription, Evolution API webhook URL setting.\n\n## Contact Rules & Escalation\n\nConfig: `memory/whatsapp-vip-contacts.json`\n\n### Priority Levels\n\n| Priority | Action | Example |\n|----------|--------|---------|\n| **VIP (call-always)** | **CALL Vincent immediately via Twilio without asking.** Do NOT ask for approval first. During quiet hours (1am-10am SGT) suppress call and use Telegram notification only, unless life-and-death emergency. Match VIPs by both phone JID (`659...@s.whatsapp.net`) and LID JID (`...@lid`) when present, because recent chats may arrive with `key.remoteJid` as an LID and the phone number only in `key.remoteJidAlt`. | Dn Jeff (+6597630553) |\n| **Family** | Always notify Vincent on Telegram | ps112 group, family members |\n| **Auto-reply** | Reply based on context file, notify Vincent | Repair shops |\n| **Normal** | Analyze intent, reply if appropriate, notify if important | Everyone else |\n| **Broadcast/Spam** | Ignore silently | Forwarded job posts, chain messages |\n\n### Quiet Hours\n- 1am-10am SGT (17:00-02:00 UTC): avoid phone calls, including for Dn Jeff / VIP call-always contacts\n- **Exception:** life-and-death emergencies always call regardless of quiet hours\n- During quiet hours, use Telegram notification only and mention that a call was suppressed due to quiet hours\n\n## Making a Call via Twilio\n\nCredentials: `/opt/data/secrets/.twilio-credentials-new` (`TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`). From: `+16614809287`, Vincent: `+6590054438`.\n\nPreferred script: `/opt/data/scripts/call_vincent.py`, now using the Twilio Python SDK baked into the container.\n\n```bash\npython3 /opt/data/scripts/call_vincent.py \"Hello Vincent, this is Jimmy. Urgent summary here.\"\n```\n\n## Action Rules\n\n### Default: DO NOT auto-reply directly, and do not spam main Telegram\nCollect intent and stay local unless something truly needs Vincent's immediate attention.\n\n**When you need Vincent to approve a reply, POST a draft to the sidecar (see \"Draft Approvals\" above).** The sidecar handles the Telegram notification with buttons, the WebApp for editing, and sending to WhatsApp on approve. Do **not** duplicate that draft or summary in the main Telegram chat.\n\nMain Telegram WhatsApp summaries are muted by default. Only use `send_message(target=\"telegram\", ...)` for urgent/important/actionable items that need Vincent now: VIP actionable messages/calls, missed/ringing family calls, emergencies, safety/medical/school-pickup issues, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors. Routine logistics, casual family pings, generic group events, links, standalone codes, normal confirmations, and suggested replies should be local-only or sidecar-only.\n\nDo NOT send WhatsApp replies directly unless Vincent explicitly approves them or the category is pre-approved.\n\n### Tone (when proposing or sending replies)\n- Casual, direct, no sign-offs (no \"Thanks & Regards\", no \"- Vincent\")\n- Ask questions instead of giving instructions\n- Keep it natural — like texting a friend\n- Christian and grace-based: warm, humble, encouraging, non-judgmental, peace-making\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- For prayer/birthday contexts, use gentle faith-filled language; avoid sounding preachy, performative, or overly formal\n\n### What to Do With Each Message\n\n**DMs (personal chats):**\n- Analyze intent\n- Notify Vincent on Telegram (via `send_message(target=\"telegram\", ...)`) only if important/actionable/time-sensitive, requires a decision, comes from a VIP contact\n- If you have a useful suggested reply, POST it as a draft to the sidecar instead of a Telegram notification — Vincent gets the buttons inline\n- Casual/social/non-actionable DMs → local log only, no Telegram notification, no draft\n- Do NOT reply unless Vincent says to\n- VIP contacts → **call Vincent immediately without asking**. Do NOT post a draft. Just call. During quiet hours (1am-10am SGT), suppress the call and send Telegram notification only, unless life-and-death emergency.\n\n**Group chats — events/meetups:**\n- If someone posts about an event, gathering, meetup, dinner, etc:\n 1. Extract: what, when, where, who\n 2. Notify Vincent on Telegram with details (no draft needed unless you also want to suggest an RSVP reply)\n 3. Ask: \"Want me to add this to Google Calendar and set a reminder?\"\n\n**Group chats — prayer requests / praying for others:**\n- If someone asks for prayer, shares a burden, illness, grief, struggle, thanksgiving, or asks the group to pray:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Identify who/what to pray for and any sensitive details\n 3. Draft a short, grace-based Christian reply/prayer\n 4. POST it to the sidecar with `summary` describing the request and `draft_text` being the prayer\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — birthday greetings:**\n- If you see birthday wishes for someone in a group:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Note who the birthday person is and relationship/context if inferable\n 3. Draft a warm, grace-based Christian birthday greeting\n 4. POST it to the sidecar\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — general:**\n- Casual banter, memes, forwards, broadcast spam → ignore silently\n- Religious content shares → ignore unless specifically addressed to Vincent\n\n### When to Notify Vincent (Telegram) vs Post a Draft\n\n**Use `send_message(target=\"telegram\", ...)` sparingly** for plain notifications only when the message needs Vincent's immediate attention: VIP actionable DMs/calls, emergencies, school pickup/child logistics requiring action now, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors.\n\n**Use the sidecar (POST `/drafts`)** whenever you have a suggested reply for Vincent to approve: prayer responses, birthday greetings, DM replies, group RSVPs. Do not also send the same summary to the main Telegram chat.\n\nDo **not** notify or draft for casual/social/non-actionable DMs, routine logistics, read receipts/status updates, religious shares not addressed to Vincent, memes, forwarded/broadcast/spam, generic group events, standalone codes, normal confirmations, or general group chatter. Family messages are not automatically main-chat alerts unless urgent, safety-related, school/pickup critical, or explicitly requiring Vincent now.\n\n## Intent Categories\n\nWhen analyzing incoming messages, classify as:\n\n1. **Urgent** — time-sensitive, emergencies → call Vincent outside quiet hours; during quiet hours use Telegram only unless life-and-death\n2. **Important** — decisions, money, appointments, anything consequential → call Vincent outside quiet hours; during quiet hours notify on Telegram only unless life-threatening\n3. **Request** — asking Vincent to do something → **notify Vincent** (Telegram) or post a draft if you have a safe reply\n4. **Enquiry** — asking for info, pricing, availability → auto-reply if context available, otherwise post a draft\n5. **Confirmation** — confirming plans, bookings → acknowledge and notify Vincent\n6. **Casual/Social** — greetings, memes, forwards → usually ignore or react\n7. **Broadcast/Spam** — forwarded job posts, chain messages → ignore\n\n## Finding Individual Contacts and JIDs\n\nWhen Vincent asks to monitor or prioritize a specific WhatsApp contact and only gives a name:\n\n1. Query contacts and chats, then search both display names and serialized message metadata:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n python3 - <<'PY'\n import os, requests, json\n base=os.environ['EVOLUTION_API_URL']; inst=os.environ['EVOLUTION_INSTANCE']; key=os.environ['EVOLUTION_API_KEY']\n headers={'apikey':key,'Content-Type':'application/json'}\n terms=['whye','keong'] # replace with lowercase search terms\n for ep in ['findContacts','findChats']:\n r=requests.post(f'{base}/chat/{ep}/{inst}', headers=headers, json={}, timeout=30)\n print('endpoint', ep, 'status', r.status_code)\n for obj in (r.json() if r.ok else []):\n if any(t in json.dumps(obj, ensure_ascii=False).lower() for t in terms):\n print(json.dumps(obj, ensure_ascii=False)[:1000])\n PY\n ```\n2. If `findChats` shows `remoteJid` ending in `@lid`, inspect `lastMessage.key.remoteJidAlt` for the phone JID. Store both in `memory/whatsapp-vip-contacts.json` when configuring VIP/call rules.\n3. Verify the number exists with `/chat/whatsappNumbers/{instance}` before finalizing.\n4. Validate the JSON config after editing: `python3 -m json.tool /opt/data/agent-state/memory/whatsapp-vip-contacts.json >/tmp/wa_vip_validated.json`.\n\n## Finding Group JIDs\n\nThe `group/fetchGroups` endpoint returns 404 on some Evolution API instances (including ours). To find a group JID when you don't have it:\n\n1. **Use `chat/findMessages` with an empty `where` filter** to pull recent messages across all chats, then extract unique `remoteJid` values containing `@g.us`.\n2. **Match by content**: Fetch messages from each group JID using `where.key.remoteJid` and match sender names or message text to identify the target group.\n3. **Known group JIDs** (store in `whatsapp-vip-contacts.json` as you discover them):\n - ps112 family group: `6591887488-1464061170@g.us`\n - Overcomers group: `6591444911-1614821287@g.us`\n - 神的华美 group: `6596756578-1405873840@g.us` (content-sharing, casual/broadcast, local log only)\n - Overcomers/TCians group: `120363230236878544@g.us` (church fellowship, PJ sends weekly attendance reports, routine attendance = local log only)\n\nExample search:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"6591444911-1614821287@g.us\"}}, \"limit\": 10, \"order\": \"desc\"}'\n```\n\n## Processing Image/Media Messages\n\nWhen an incoming WhatsApp message is an image (imageMessage), video, or other media with no text or only a caption:\n\n1. **Fetch conversation context** as usual with `wa_recent_messages.py`\n2. **Download the media** using the Evolution API:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"message\": {\"key\": {\"id\": \"MESSAGE_ID\", \"fromMe\": false, \"remoteJid\": \"GROUP_OR_CHAT_JID\"}}, \"convertToMp4\": false}' \\\n -o /tmp/wa_media_response.json\n ```\n Then decode: `python3 -c \"import json,base64; d=json.load(open('/tmp/wa_media_response.json')); open('/tmp/wa_media.jpg','wb').write(base64.b64decode(d['base64']))\"`\n3. **Analyze the image** using `vision_analyze` tool with `/tmp/wa_media.jpg` path and a context-aware question\n4. **Fallback if vision is unavailable**: Use `tesseract /tmp/wa_media.jpg stdout` for OCR, combined with pixel analysis (PIL/numpy) and conversation context to infer what the image contains\n5. **Album messages** (`albumMessage`): These bundle multiple images. Each image arrives as a separate `imageMessage` with a `messageAssociation.parentMessageKey` linking them to the album. Download each image separately. The `expectedImageCount` field tells how many images to expect\n6. **messageAssociation**: When images have `parentMessageKey` with `fromMe: true`, they are replying to a message Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an incoming WhatsApp message is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification when possible:\n\n1. **Fetch conversation context first**, especially in groups:\n ```bash\n python3 /opt/data/scripts/wa_recent_messages.py \"\" 50\n ```\n2. **Download the audio** with Evolution API, saving the response to a file:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_audio_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"GROUP_OR_CHAT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_audio_req.json -o /tmp/wa_audio_response.json\n python3 - <<'PY'\n import json, base64\n d=json.load(open('/tmp/wa_audio_response.json'))\n b=d.get('base64') or d.get('data',{}).get('base64')\n if ',' in b[:100]: b=b.split(',',1)[1]\n open('/tmp/wa_audio.ogg','wb').write(base64.b64decode(b))\n PY\n ```\n3. **Convert OGG/Opus to WAV**:\n ```bash\n ffmpeg -y -i /tmp/wa_audio.ogg -ar 16000 -ac 1 /tmp/wa_audio.wav\n ```\n4. **Transcribe fallback when Whisper is unavailable**: the container may not have `whisper`, `faster-whisper`, `torch`, or `speech_recognition` installed globally. A reliable lightweight fallback is a temp venv with SpeechRecognition and Google web speech:\n ```bash\n python3 -m venv /tmp/srvenv\n /tmp/srvenv/bin/pip install --quiet SpeechRecognition\n /tmp/srvenv/bin/python - <<'PY'\n import speech_recognition as sr\n r=sr.Recognizer()\n with sr.AudioFile('/tmp/wa_audio.wav') as source:\n audio=r.record(source)\n print(r.recognize_google(audio, language='en-SG'))\n PY\n ```\n5. **Check adjacent voice notes from the same sender**. WhatsApp voice notes are often sent in quick sequential chunks. Download and transcribe nearby `audioMessage` records from the same sender before deciding whether to notify Vincent.\n6. Use the transcript plus recent context to classify the message. For group meetup planning, extract what/when/where/who and notify Vincent if actionable, including transcript confidence.\n\n### Video Messages\n\nWhen an incoming WhatsApp message is `videoMessage` (with or without a caption):\n\n1. **Fetch conversation context** first, as with any media message: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n2. **Download the video** using Evolution API (same pattern as images/audio — save response to file, then decode):\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_video_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"CONTACT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_video_req.json -o /tmp/wa_video_response.json\n python3 - <<'PY'\n import json, base64\n d = json.load(open('/tmp/wa_video_response.json'))\n b = d.get('base64') or d.get('data', {}).get('base64', '')\n if ',' in b[:100]: b = b.split(',', 1)[1]\n open('/tmp/wa_video.mp4', 'wb').write(base64.b64decode(b))\n print(f\"Video saved: {len(base64.b64decode(b))} bytes\")\n PY\n ```\n3. **Extract representative frames** with ffmpeg (do NOT try to analyze video directly — vision tools work on images):\n ```bash\n # Frame at start\n ffmpeg -y -ss 0 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame0.jpg\n # Frame at ~1/3 of the way (good for short videos)\n ffmpeg -y -ss 3 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame3s.jpg\n # If video is longer (>15s), also extract at midpoint\n # ffprobe -v quiet -print_format json -show_format /tmp/wa_video.mp4 # get duration\n # ffmpeg -y -ss $HALF_DURATION -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame_mid.jpg\n ```\n4. **Analyze the frames** with `vision_analyze`. Ask specific questions about content: what objects/people/text are visible, what type of video this appears to be (screen recording, personal clip, forwarded meme, etc.), and any text or identifying details.\n5. **Use the caption** from `videoMessage.caption` if present (can be empty/null). Combine frame analysis + caption + conversation context to classify the message intent.\n6. **Transcription of audio in video** is generally not feasible with available tools. If the video likely contains important spoken content (e.g. voice memo style), note this limitation and flag it to Vincent.\n\n**Key points:**\n- Always use `-o` flag and file-based decode; videos can be 1-10MB and base64 responses will be even larger\n- `--max-time 25` on curl is essential for video downloads\n- Short casual clips (showing a key fob, a parking spot, quick view of surroundings) are usually casual/social — classify accordingly and don't over-escalate\n- If `vision_analyze` fails (400 on models that don't support images, or 429 rate limit), fall back to conversation context inference alone\n\n### Poll Messages\nWhen `pollUpdateMessage` appears in chat history (someone voted), the message contains a `pollCreationMessageKey` with the original poll's message `id`. To get the poll question and options, query:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"id\": \"POLL_MESSAGE_ID\", \"remoteJid\": \"GROUP_JID\"}}, \"limit\": 1}'\n```\nThe response will have `pollCreationMessageV3` with `name` (the poll question) and `options` array. Always look up the poll when votes or poll references appear, since the poll text contains the event details.\n\n### `associatedChildMessage` (Motion Photos / Companion Media)\nWhatsApp motion photos and companion media arrive as `associatedChildMessage` type. These are short videos or supplementary images attached to a parent message via `messageAssociation.parentMessageKey`. The structure is:\n\n```\nmessage: {\n messageContextInfo: {\n messageAssociation: {\n associationType: 12,\n parentMessageKey: { id, fromMe, remoteJid }\n }\n },\n associatedChildMessage: {\n message: {\n videoMessage: { ... } // or imageMessage: { ... }\n }\n }\n}\n```\n\nKey points:\n- The `parentMessageKey.id` links back to the main message (the image/video they accompany)\n- When `parentMessageKey.fromMe: true`, the companion media is replying to a message Vincent sent\n- Captions may appear on the child's `videoMessage.caption` or `imageMessage.caption` inside the nested message, NOT at the top level\n- These often contain motion-photo short videos (3 seconds) or album companion images\n- Treat them as media attachments to the parent; don't process them as standalone messages requiring independent intent analysis\n\n### Sequential Image Sets (Non-Album)\nMultiple images sent rapidly by the same sender at similar timestamps may NOT be wrapped in an `albumMessage`. They appear as separate `imageMessage` entries. Always check for adjacent images from the same sender and download+OCR all of them together, as they often form a coherent multi-page set (e.g., route info, comparison tables, maps). Missing one can lose important context.\n\n### Key pitfalls\n- The Evolution API `getBase64FromMediaMessage` endpoint returns the full base64 payload (can be 100KB+ for images). Use `-o` flag to save to file, then parse with Python. Do NOT try to pipe it directly\n- `curl` with `--max-time 25` is essential; without it, large media responses can hang\n- The `albumMessage` type itself has no `fromMe` field; check each individual `imageMessage` instead\n- Image thumbnails in the webhook payload are truncated; always download the full image for analysis\n- **WhatsApp media URLs expire**: CDN download links (`mmg.whatsapp.net`) expire after roughly 24-48 hours. When `getBase64FromMediaMessage` returns a 400/404 error like \"Failed to fetch stream from https://mmg.whatsapp.net/...\", the media is no longer retrievable via the API. This means you CANNOT download adjacent/older voice notes, images, or documents for context if they're more than a day or two old. Fall back to conversation text context only. Fresh/incoming messages are almost always still downloadable.\n- **Voice note transcription reliability**: Short voice notes (<15s), especially in Singlish/Chinese-English mix common in Singapore contacts, often produce unreliable transcriptions via SpeechRecognition+Google. Always flag low-confidence transcriptions to Vincent rather than acting on potentially garbled text.\n- **Vision API failures**: `vision_analyze` can fail with 429 (rate limit) **or 400** (model doesn't support image input, e.g. glm-5.1 via ollama-cloud). Always have a fallback plan. When vision is unavailable, try tesseract OCR, then pixel analysis (PIL/numpy for brightness zones, text region detection via row-level dark-pixel density)\n- **OCR can fail on decorative/complex images**: Obituaries, memorial notices, religious flyers with ornate fonts, or dark-background images often defeat tesseract even with aggressive preprocessing (upscaling, thresholding, inverting, PSM mode variations). When OCR returns gibberish or empty, fall back to conversation context inference: use preceding/surrounding messages to determine what the image likely contains, then draft an appropriate response based on that context\n- **OCR on devotional/dark-background images**: These often defeat tesseract on full-image scans. Use a multi-strategy approach: (1) run tesseract on the full image first, (2) if text is sparse, crop into regions (top 15%, middle, bottom 15%), upscale 2-3x, enhance contrast (2-3x), and try both normal and inverted (`ImageOps.invert`) versions with `--psm 6`, (3) combine all partial OCR results. For Exodus/verse images, the title (e.g. \"WHEN MOSES' HANDS WERE\") is often in decorative font that OCR misses, while the verse body text is usually readable.\n- **Prioritize conversation context**: When processing group images, always fetch recent context FIRST (before attempting vision/OCR). If image analysis fails, the conversation history often makes the image's purpose clear enough to draft a meaningful reply\n- **DM conversation context is one-sided (fromMe only)**: The Evolution API `findMessages` endpoint often returns only Vincent's sent messages (`fromMe: true`) for DM chats, not the contact's incoming messages. This means `wa_recent_messages.py` can show only one side of the conversation. When processing a DM webhook, do NOT rely solely on fetched history for the contact's side. The webhook payload already contains the incoming message text. Use what the webhook gives you plus Vincent's side from the API. If you need the contact's recent messages, check if `fromMe: false` records exist; if none are returned, the API simply doesn't have them and you must work with what you have (webhook message + Vincent's replies for context).\n- **`findMessages` ignores `key.fromMe` filter**: Passing `\"key.fromMe\": false` in the `where` clause still returns all messages (including `fromMe: true`). To find messages from a specific sender, fetch results and filter client-side by checking `fromMe` in Python.\n- **`findMessages` caps at ~50 records per request**: Even with `limit: 200` or `limit: 500`, the API returns at most 50 records. When one party sends many consecutive messages, the other party's messages may not appear in the first 50 results. Use pagination or fetch multiple pages if you need older messages.\n- **Video frame extraction**: Use `ffmpeg -y -ss N -i video.mp4 -frames:v 1 -update 1 output.jpg` (the `-update 1` flag is needed; without it ffmpeg treats the output as an image sequence and fails). Extract frames at multiple timestamps (start, ~1/3, midpoint for longer videos) since a single frame may miss the key content.\n- **ffmpeg \"Invalid color space\" failure on WhatsApp videos**: Some WhatsApp H.264 videos have `color_space=reserved` and `color_primaries=reserved` metadata that ffmpeg cannot decode at all. Every codec/format/filter combination fails with \"Invalid color space\" / return code 234. This is NOT fixable with `-vf setparams`, `-color_trc`, `-colorspace`, `-color_primaries`, or `-vf format=` flags because the error occurs at the decoder level before any filter runs. **Do not waste time trying ffmpeg workarounds.** When this happens, fall back to: (1) conversation context inference (who sent it, what thread it's in, adjacent messages), (2) the jpegThumbnail embedded in the videoMessage (small but gives a rough idea; base64-encoded JPEG, typically 600-900 bytes, use `vision_analyze` on it if available), or (3) if all analysis fails, just classify based on surrounding text context.\n- **`wa_recent_messages.py` JSON output breaks `json.loads()`**: The script's stdout contains WhatsApp message content with control characters (emoji, CJK, deviceListMetadata binary blobs) that cause `json.loads()` to fail with `Invalid control character` or `Expecting ',' delimiter` errors. Even `strict=False` does not reliably fix it. **Reliable approach**: use `curl -o /tmp/wa_msgs.json` to save the Evolution API response directly to a file (the file write handles binary safely), then parse with `json.load(open('/tmp/wa_msgs.json'))` in Python. Example:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 15 -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"CONTACT_JID\"}}, \"limit\": 20}' -o /tmp/wa_msgs.json\n ```\n Then in Python: `data = json.load(open('/tmp/wa_msgs.json')); records = data['messages']['records']`\n\n## API Reference\n\nFor full API endpoints (media, contacts, location, buttons) — see [references/api.md](references/api.md).\n\n[Skill directory: /opt/data/skills/personal/whatsapp-evolution]\nResolve any relative paths in this skill (e.g. `scripts/foo.js`, `templates/config.yaml`) against that directory, then run them with the terminal tool using the absolute path.\n\n[This skill has supporting files:]\n- references/api.md -> /opt/data/skills/personal/whatsapp-evolution/references/api.md\n- scripts/process_queue.py -> /opt/data/skills/personal/whatsapp-evolution/scripts/process_queue.py\n- scripts/send_text.sh -> /opt/data/skills/personal/whatsapp-evolution/scripts/send_text.sh\n\nLoad any of these with skill_view(name=\"personal/whatsapp-evolution\", file_path=\"\"), or run scripts directly by absolute path (e.g. `node /opt/data/skills/personal/whatsapp-evolution/scripts/foo.js`).\n\nThe user has provided the following instruction alongside the skill invocation: Incoming WhatsApp webhook received.\n\nEvent: messages.upsert\nInstance: n8n-WA\nSender: Shen Jiaming\nSender JID: 120363418226875471@g.us\nFrom me: False\nMessage type: reactionMessage\nMessage: {\n \"event\": \"messages.upsert\",\n \"instance\": \"n8n-WA\",\n \"data\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"2A1BC90CF31059BC75C1\",\n \"participant\": \"5703783682189@lid\",\n \"participantAlt\": \"6592760070@s.whatsapp.net\",\n \"addressingMode\": \"lid\"\n },\n \"pushName\": \"Shen Jiaming\",\n \"status\": \"DELIVERY_ACK\",\n \"message\": {\n \"reactionMessage\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": true,\n \"id\": \"AC37643983DC88906B4D0F1A09F74082\",\n \"participant\": \"54644919296042@lid\"\n },\n \"text\": \"\\u2764\\ufe0f\",\n \"senderTimestampMs\": {\n \"low\": -1597307827,\n \"high\": 414,\n \"unsigned\": false\n }\n }\n },\n \"messageType\": \"reactionMessage\",\n \"messageTimestamp\": 1780814121,\n \"instanceId\": \"b2bfaa30-d5f1-4b22-b979-791892619414\",\n \"source\": \"unknown\"\n },\n \"destination\": \"https://jimmy.huper.asia/webhooks/whatsapp-incoming\",\n \"date_time\": \"2026-06-07T03:35:21.432Z\",\n \"sender\": \"6590054438@s.whatsapp.net\",\n \"server_url\": \"https://evolution.huper.asia\",\n \"apikey\": \"5E4C532223A0-460B-A749-16C584F56E74\"\n}\n\nPrimary goal: REDUCE NOISE. Stay silent in the main Telegram chat unless the message truly needs Vincent's immediate attention. Approval-worthy reply drafts should go to the jimmy-wa-drafts sidecar, NOT to the main Telegram chat.\n\nFollow the whatsapp-evolution skill and /opt/data/agent-state/memory/whatsapp-vip-contacts.json, with these stricter notification rules:\n\nABSOLUTE SILENCE RULES:\n- If fromMe is true: ignore. Do NOT call send_message. Final response local log only.\n- Ignore status/update/read-receipt events or payloads with no real message content.\n- Do NOT send main Telegram summaries for ordinary DMs, routine logistics, casual family messages, generic group events, group chatter, forwards, memes, devotional/religious shares, links, standalone codes, normal confirmations, or suggested replies.\n- Do NOT ask \"Want me to add this to Calendar?\" for WhatsApp group events unless Vincent is explicitly invited, named, responsible, or the event is time-sensitive and clearly relevant to him today.\n- Do NOT surface family messages just because they are family. Only surface if urgent, safety-related, school/pickup critical, money/legal/medical, or explicitly requires Vincent now.\n\nSIDEcar RULES:\n- If a reply may be useful but it is not urgent, POST it to https://jimmy-draft.huper.asia/drafts using the whatsapp-evolution Draft Approvals flow. Do NOT duplicate that draft/summary in the main Telegram chat.\n- Prayer and birthday replies should go to the sidecar when appropriate, not the main Telegram chat, unless the underlying situation is urgent/sensitive and Vincent needs immediate awareness.\n- Before drafting any reply or approval suggestion, fetch recent context: python3 /opt/data/scripts/wa_recent_messages.py \"120363418226875471@g.us\" 30. For active/ambiguous groups use 50.\n\nMAIN TELEGRAM ALERTS ARE ALLOWED ONLY FOR:\n1. VIP contacts marked always_call or always_notify when the message is important/actionable, or any VIP call event.\n2. Calls from family/VIP that appear missed/ringing and likely need immediate attention. Avoid repeated alerts for accepted/terminated duplicate call state updates.\n3. Emergencies, safety issues, medical issues, school pickup/child logistics requiring action now, legal/money/account/security matters, or time-sensitive commitments in the next few hours.\n4. Messages explicitly asking Vincent to call back, decide now, approve payment, provide OTP/code for an active task, or respond urgently.\n5. Pipeline errors or webhook processing failures.\n\nCALL RULES:\n- VIP contacts (e.g. Dn Jeff, Ng Whye Keong): call Vincent immediately outside quiet hours when actionably important or on ringing/missed call. During 1am-10am SGT, Telegram only unless life-and-death emergency.\n- Do not call for casual VIP confirmations or non-actionable courtesy messages.\n\nTONE/RULES FOR DRAFTS:\n- Proposed replies must be Christian and grace-based where appropriate: warm, humble, encouraging, non-judgmental, peace-making, not preachy.\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- NEVER auto-reply directly on WhatsApp unless Vincent explicitly approves or future config marks that exact category pre-approved.\n\nDelivery rule:\n- For allowed main alerts only: call send_message(target=\"telegram\", message=...). Keep it concise.\n- For non-urgent reply suggestions: use the sidecar only.\n- For everything else: DO NOT call send_message. Keep final response as local-only log.\n", "timestamp": 1780814128.1355953, "anchor": true}, {"id": 104138, "role": "assistant", "content": "This is a reaction message (❤️ emoji) from Shen Jiaming reacting to a message Vincent previously sent in a group chat. No actionable content, no reply needed, no notification warranted.\n\n**Local log only. No Telegram alert, no sidecar draft.**", "timestamp": 1780814128.2689607}], "bookend_end": [], "messages_before": 0, "messages_after": 2}, {"session_id": "20260607_063519_732a7c72", "when": "June 07, 2026 at 06:35 AM", "source": "webhook", "model": "glm-5.1", "title": null, "matched_role": "user", "match_message_id": 104134, "snippet": "...true`, they are replying to a >>>message<<< Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an >>>incoming<<< WhatsApp >>>message<<< is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification...", "bookend_start": [], "messages": [{"id": 104134, "role": "user", "content": "[IMPORTANT: The user has invoked the \"whatsapp-evolution\" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]\n\n---\nname: whatsapp-evolution\ndescription: WhatsApp messaging via Evolution API — send, receive, analyze, and auto-reply. Use when processing an incoming WhatsApp message delivered via the webhook, replying on Vincent's behalf, liaising with contacts, sending messages, or handling WhatsApp-related tasks. Covers intent analysis, auto-reply rules, escalation (VIP → phone call), conversation context fetching, and draft approvals via the jimmy-wa-drafts sidecar.\n---\n\n# WhatsApp via Evolution API\n\n## Credentials\nStored in `/opt/data/secrets/.whatsapp-evolution-credentials`:\n- `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE`\n\nLoad with: `export $(cat .whatsapp-evolution-credentials | xargs)`\n\n## Sending Messages (direct, no approval)\n\n```bash\nexport $(cat .whatsapp-evolution-credentials | xargs)\ncurl -s -X POST \"${EVOLUTION_API_URL}/message/sendText/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"number\": \"6591234567\", \"text\": \"Hello!\"}'\n```\n\n- Phone format: country code, no `+` (e.g. `6590054438`)\n- Groups: use JID `{groupid}@g.us`\n- For media/location/contacts — see [references/api.md](references/api.md)\n\nUse direct send only when Vincent has explicitly commanded/approved the exact send, or for pre-approved auto-replies/follow-ups. For any reply that needs Vincent's review, use the sidecar (next section).\n\n### Direct replies when Vincent names an existing WhatsApp thread\n\nIf Vincent says to “reply to X” or “confirm with X” and has clearly approved the action, but the contact JID/number is not in the current prompt:\n\n1. Search recent session history first (`session_search`) for the contact name, clinic/business name, or appointment details. Prior webhook summaries often contain the exact `remoteJid`, alternate phone JID, sender name, and context.\n2. If session history identifies the WhatsApp phone JID, send directly only if Vincent’s instruction is explicit enough to act. Keep the text concise and natural.\n3. If only a LID JID is found, look for `remoteJidAlt`/alternate phone JID in the session summary or fetch contacts/chats before sending.\n4. Do not invent phone numbers from web search when replying to an existing WhatsApp conversation. Use web search only as supporting context, not as the destination, unless Vincent explicitly asks to contact a public number.\n\n\n### Sending trip/camp details from retrieved emails\n\nWhen Vincent explicitly asks to send extracted school trip/camp details to a child or family member, send directly. Keep it practical and grounded:\n- Include dates, reporting/pick-up times, flights, venue/itinerary highlights, deadlines, forms, and safety/medication notes when found.\n- If an official packing list is missing, say that clearly. Do not invent one. You may add a short “based on itinerary/weather, prepare…” list, clearly labeled as practical guidance rather than official school instructions.\n- Avoid long explanations about the search process in WhatsApp. Send the useful details.\n\n### Sending a local image/file as media\n\nFor user-approved media sends, base64-encode the local file and call `sendMedia`:\n\n```bash\nexport $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\npython3 - <<'PY'\nimport base64, json\npath = '/tmp/file.png'\npayload = {\n 'number': '6591234567',\n 'mediatype': 'image',\n 'mimetype': 'image/png',\n 'caption': 'Caption here',\n 'media': base64.b64encode(open(path, 'rb').read()).decode(),\n 'fileName': 'file.png',\n}\nopen('/tmp/wa_media_payload.json', 'w').write(json.dumps(payload))\nPY\ncurl -s --max-time 60 -X POST \"${EVOLUTION_API_URL}/message/sendMedia/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_media_payload.json\n```\n\nWhen sending QR tickets or passes, verify the crop/image contains the full QR code and the intended attendee name before sending; the QR screenshot is effectively the redeemable ticket.\n\n## Fetching Conversation Context\n\nBefore drafting or proposing any reply, always fetch recent chat history for context:\n\n```bash\npython3 /opt/data/scripts/wa_recent_messages.py \"6591234567@s.whatsapp.net\" 30\n```\n\nUse `limit=30` by default. For active group chats or ambiguous situations, fetch up to 50 messages. Read enough context to understand who is being discussed, current tone, previous commitments, and whether Vincent has already replied. The script returns newest messages first (descending order).\n\n## Draft Approvals — `jimmy-wa-drafts` sidecar\n\nWhen you need Vincent to approve a WhatsApp reply, POST the draft to the sidecar. It owns the entire approval flow: Telegram message with 4 inline buttons, Telegram WebApp for Edit/Context, sending to WhatsApp via Evolution API on Approve.\n\n### Posting a draft\n\nAlways include an `extras` object with the original WA context. The sidecar passes `extras` through verbatim if Vincent later clicks \"Context\" — without it, your redraft will be flying blind.\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\n \"contact\": \"120363351787895353@g.us\",\n \"contact_name\": \"Cycling group\",\n \"summary\": \"Karen sent a blessing over the bike ride; she's not joining\",\n \"draft_text\": \"Thanks Karen, will miss you both — praying for a good time too 🙏\",\n \"extras\": {\n \"group_jid\": \"120363351787895353@g.us\",\n \"sender_name\": \"Karen Tan\",\n \"sender_jid\": \"6597630553@s.whatsapp.net\",\n \"category\": \"prayer/blessing\",\n \"original_message\": \"Happy cycling, to those who can make it… Dn Jeff and I had made plans for this PM so we won't be able to join y'all. Praying the Lord's favor, protection and good-weather blessings be upon y'all and your route!\",\n \"extracted\": {\n \"what\": \"Cycling from East Coast Park to Changi Village, then dinner\",\n \"when\": \"Today, 4:00 PM\",\n \"where\": \"East Coast Park to Changi Village\",\n \"who\": \"Group members who voted yes for cycling/dinner or dinner\"\n },\n \"confidence\": \"high\",\n \"why_appropriate\": \"Multiple group messages are blessings/wishes; Vincent has not replied; tone matches grace-based norm.\",\n \"wa_message_id\": \"ABC123XYZ\"\n }\n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\nResponse: `{\"draft_id\":\"abc123def\",\"telegram_message_id\":42}`\n\nWhat to put in `extras` (use whatever makes sense for the message type):\n\n- **Always**: `original_message` (verbatim WA text), `sender_name`, `category` (e.g. `prayer`, `birthday`, `event`, `dm-reply`, `enquiry`)\n- **Group messages**: `group_jid`, `sender_jid`\n- **DMs**: `sender_jid`\n- **Strongly recommended (drives the rich Telegram render)**: `extracted` (dict of who/what/when/where), `confidence` (`high`/`medium`/`low`), `why_appropriate` (one-line reasoning shown to Vincent so he can decide)\n- **Optional**: `wa_message_id`, `tone_hint` (e.g. `warm-grace-based`), `sender_phone` (auto-derived from `sender_jid` if omitted)\n\nVincent receives a Telegram message in the `@jimmy_WA_bot` chat with 4 inline buttons. The sidecar handles every interaction; **you do not track replies, callbacks, or message IDs**. Just POST and move on.\n\n| Button | Behavior |\n|--------|----------|\n| **Approve** | Sidecar sends draft to WhatsApp |\n| **Edit** | Opens a Telegram WebApp with draft pre-filled; Vincent edits + sends; sidecar pushes to WhatsApp |\n| **Context** | Opens a Telegram WebApp with empty textarea + the original WA message visible; Vincent provides context. (When `HERMES_REDRAFT_WEBHOOK` is wired in the sidecar's env, this will POST back to Jimmy for a redraft. Until then, the context is just stored on the draft.) |\n| **Pass** | Sidecar deletes the draft |\n\n### JSON escaping when `draft_text` contains quotes\n\nSingle quotes (`It's`, `God's`) break shell interpolation. Write the JSON to a temp file first:\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\"contact\":\"6591234567@s.whatsapp.net\",\"contact_name\":\"John\",\"summary\":\"…\",\"draft_text\":\"It's nice to hear from you!\"}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\n### Networking\n\n`https://jimmy-draft.huper.asia` is the public HTTPS endpoint of the sidecar (TLS-terminated by Traefik on ubuntu-server). HTTPS is required because Hermes blocks plain-HTTP commands via tirith. The internal Docker DNS hostname http://jimmy-wa-v64mqe:8080 also works for non-shell calls but the agent is forbidden from using it.\n\n### Sidecar status & logs\n\n- Source: (auto-deploys on push)\n- Public URL: (`/health` for ping)\n- Container: `jimmy-wa-v64mqe.1.` on ubuntu-server\n- Bot: `@jimmy_WA_bot`\n\n## Handling a Redraft Trigger (`wa-redraft` webhook)\n\nWhen Vincent clicks the **Context** button on a draft, the sidecar:\n1. Edits the original Telegram message to \"🔄 Adding context for X… agent is redrafting.\" (removes the old buttons so Vincent can't approve the now-stale text)\n2. POSTs to your `wa-redraft` webhook (`http://localhost:8644/webhooks/wa-redraft`) with the original draft fields plus Vincent's `context` instructions and the `extras` you stored when you first created the draft.\n\nWhen you receive a `wa-redraft` activation, the prompt template will lay out the fields. Your job:\n\n1. Read Vincent's `context` carefully — that's the steering signal. The `previous_draft` is just history, not necessarily a starting point.\n2. Re-fetch recent WA history if the original context isn't enough (`python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`). Use `extras.group_jid` for groups or `contact` for DMs.\n3. Write a NEW draft that addresses Vincent's instructions while staying faithful to the original WA context (in `extras.original_message`, `extras.extracted`).\n4. POST the new draft back to the sidecar **with the same `draft_id`** so the existing draft is updated, not duplicated:\n\n```bash\ncat > /tmp/wa_redraft.json <\",\n \"contact\": \"\",\n \"contact_name\": \"\",\n \"summary\": \"\",\n \"draft_text\": \"\",\n \"extras\": \n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_redraft.json\n```\n\nThe sidecar will send a NEW Telegram message with 4 fresh buttons. Vincent can Approve, Edit, Context-again (loop), or Pass.\n\nDo **not** send to WhatsApp directly during a redraft — the approval flow is the sidecar's job.\n\n## Incoming Message Processing\n\n### Delivery Pipeline (current)\n1. Evolution API POSTs each `MESSAGES_UPSERT` webhook directly to Hermes at `http://localhost:8644/webhooks/whatsapp-incoming` (proxied via `jimmy.huper.asia/webhooks/whatsapp-incoming`)\n2. Hermes webhook route (`whatsapp-incoming`, dynamic subscription) validates and spawns an agent\n3. Agent loads `whatsapp-evolution` skill, follows the rules below\n4. Webhook subscription uses `deliver: log` because webhooks do not support `deliver: local`; the agent explicitly calls `send_message(target=\"telegram\", ...)` only for important/actionable/VIP/event/birthday/prayer/error items, OR posts a draft to the sidecar for approval items.\n\n**No more OpenClaw, no Python webhook server, no port 5001, no keepalive cron.** If messages aren't being processed, check: nginx proxy to port 8644, `hermes webhook list` for the subscription, Evolution API webhook URL setting.\n\n## Contact Rules & Escalation\n\nConfig: `memory/whatsapp-vip-contacts.json`\n\n### Priority Levels\n\n| Priority | Action | Example |\n|----------|--------|---------|\n| **VIP (call-always)** | **CALL Vincent immediately via Twilio without asking.** Do NOT ask for approval first. During quiet hours (1am-10am SGT) suppress call and use Telegram notification only, unless life-and-death emergency. Match VIPs by both phone JID (`659...@s.whatsapp.net`) and LID JID (`...@lid`) when present, because recent chats may arrive with `key.remoteJid` as an LID and the phone number only in `key.remoteJidAlt`. | Dn Jeff (+6597630553) |\n| **Family** | Always notify Vincent on Telegram | ps112 group, family members |\n| **Auto-reply** | Reply based on context file, notify Vincent | Repair shops |\n| **Normal** | Analyze intent, reply if appropriate, notify if important | Everyone else |\n| **Broadcast/Spam** | Ignore silently | Forwarded job posts, chain messages |\n\n### Quiet Hours\n- 1am-10am SGT (17:00-02:00 UTC): avoid phone calls, including for Dn Jeff / VIP call-always contacts\n- **Exception:** life-and-death emergencies always call regardless of quiet hours\n- During quiet hours, use Telegram notification only and mention that a call was suppressed due to quiet hours\n\n## Making a Call via Twilio\n\nCredentials: `/opt/data/secrets/.twilio-credentials-new` (`TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`). From: `+16614809287`, Vincent: `+6590054438`.\n\nPreferred script: `/opt/data/scripts/call_vincent.py`, now using the Twilio Python SDK baked into the container.\n\n```bash\npython3 /opt/data/scripts/call_vincent.py \"Hello Vincent, this is Jimmy. Urgent summary here.\"\n```\n\n## Action Rules\n\n### Default: DO NOT auto-reply directly, and do not spam main Telegram\nCollect intent and stay local unless something truly needs Vincent's immediate attention.\n\n**When you need Vincent to approve a reply, POST a draft to the sidecar (see \"Draft Approvals\" above).** The sidecar handles the Telegram notification with buttons, the WebApp for editing, and sending to WhatsApp on approve. Do **not** duplicate that draft or summary in the main Telegram chat.\n\nMain Telegram WhatsApp summaries are muted by default. Only use `send_message(target=\"telegram\", ...)` for urgent/important/actionable items that need Vincent now: VIP actionable messages/calls, missed/ringing family calls, emergencies, safety/medical/school-pickup issues, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors. Routine logistics, casual family pings, generic group events, links, standalone codes, normal confirmations, and suggested replies should be local-only or sidecar-only.\n\nDo NOT send WhatsApp replies directly unless Vincent explicitly approves them or the category is pre-approved.\n\n### Tone (when proposing or sending replies)\n- Casual, direct, no sign-offs (no \"Thanks & Regards\", no \"- Vincent\")\n- Ask questions instead of giving instructions\n- Keep it natural — like texting a friend\n- Christian and grace-based: warm, humble, encouraging, non-judgmental, peace-making\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- For prayer/birthday contexts, use gentle faith-filled language; avoid sounding preachy, performative, or overly formal\n\n### What to Do With Each Message\n\n**DMs (personal chats):**\n- Analyze intent\n- Notify Vincent on Telegram (via `send_message(target=\"telegram\", ...)`) only if important/actionable/time-sensitive, requires a decision, comes from a VIP contact\n- If you have a useful suggested reply, POST it as a draft to the sidecar instead of a Telegram notification — Vincent gets the buttons inline\n- Casual/social/non-actionable DMs → local log only, no Telegram notification, no draft\n- Do NOT reply unless Vincent says to\n- VIP contacts → **call Vincent immediately without asking**. Do NOT post a draft. Just call. During quiet hours (1am-10am SGT), suppress the call and send Telegram notification only, unless life-and-death emergency.\n\n**Group chats — events/meetups:**\n- If someone posts about an event, gathering, meetup, dinner, etc:\n 1. Extract: what, when, where, who\n 2. Notify Vincent on Telegram with details (no draft needed unless you also want to suggest an RSVP reply)\n 3. Ask: \"Want me to add this to Google Calendar and set a reminder?\"\n\n**Group chats — prayer requests / praying for others:**\n- If someone asks for prayer, shares a burden, illness, grief, struggle, thanksgiving, or asks the group to pray:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Identify who/what to pray for and any sensitive details\n 3. Draft a short, grace-based Christian reply/prayer\n 4. POST it to the sidecar with `summary` describing the request and `draft_text` being the prayer\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — birthday greetings:**\n- If you see birthday wishes for someone in a group:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Note who the birthday person is and relationship/context if inferable\n 3. Draft a warm, grace-based Christian birthday greeting\n 4. POST it to the sidecar\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — general:**\n- Casual banter, memes, forwards, broadcast spam → ignore silently\n- Religious content shares → ignore unless specifically addressed to Vincent\n\n### When to Notify Vincent (Telegram) vs Post a Draft\n\n**Use `send_message(target=\"telegram\", ...)` sparingly** for plain notifications only when the message needs Vincent's immediate attention: VIP actionable DMs/calls, emergencies, school pickup/child logistics requiring action now, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors.\n\n**Use the sidecar (POST `/drafts`)** whenever you have a suggested reply for Vincent to approve: prayer responses, birthday greetings, DM replies, group RSVPs. Do not also send the same summary to the main Telegram chat.\n\nDo **not** notify or draft for casual/social/non-actionable DMs, routine logistics, read receipts/status updates, religious shares not addressed to Vincent, memes, forwarded/broadcast/spam, generic group events, standalone codes, normal confirmations, or general group chatter. Family messages are not automatically main-chat alerts unless urgent, safety-related, school/pickup critical, or explicitly requiring Vincent now.\n\n## Intent Categories\n\nWhen analyzing incoming messages, classify as:\n\n1. **Urgent** — time-sensitive, emergencies → call Vincent outside quiet hours; during quiet hours use Telegram only unless life-and-death\n2. **Important** — decisions, money, appointments, anything consequential → call Vincent outside quiet hours; during quiet hours notify on Telegram only unless life-threatening\n3. **Request** — asking Vincent to do something → **notify Vincent** (Telegram) or post a draft if you have a safe reply\n4. **Enquiry** — asking for info, pricing, availability → auto-reply if context available, otherwise post a draft\n5. **Confirmation** — confirming plans, bookings → acknowledge and notify Vincent\n6. **Casual/Social** — greetings, memes, forwards → usually ignore or react\n7. **Broadcast/Spam** — forwarded job posts, chain messages → ignore\n\n## Finding Individual Contacts and JIDs\n\nWhen Vincent asks to monitor or prioritize a specific WhatsApp contact and only gives a name:\n\n1. Query contacts and chats, then search both display names and serialized message metadata:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n python3 - <<'PY'\n import os, requests, json\n base=os.environ['EVOLUTION_API_URL']; inst=os.environ['EVOLUTION_INSTANCE']; key=os.environ['EVOLUTION_API_KEY']\n headers={'apikey':key,'Content-Type':'application/json'}\n terms=['whye','keong'] # replace with lowercase search terms\n for ep in ['findContacts','findChats']:\n r=requests.post(f'{base}/chat/{ep}/{inst}', headers=headers, json={}, timeout=30)\n print('endpoint', ep, 'status', r.status_code)\n for obj in (r.json() if r.ok else []):\n if any(t in json.dumps(obj, ensure_ascii=False).lower() for t in terms):\n print(json.dumps(obj, ensure_ascii=False)[:1000])\n PY\n ```\n2. If `findChats` shows `remoteJid` ending in `@lid`, inspect `lastMessage.key.remoteJidAlt` for the phone JID. Store both in `memory/whatsapp-vip-contacts.json` when configuring VIP/call rules.\n3. Verify the number exists with `/chat/whatsappNumbers/{instance}` before finalizing.\n4. Validate the JSON config after editing: `python3 -m json.tool /opt/data/agent-state/memory/whatsapp-vip-contacts.json >/tmp/wa_vip_validated.json`.\n\n## Finding Group JIDs\n\nThe `group/fetchGroups` endpoint returns 404 on some Evolution API instances (including ours). To find a group JID when you don't have it:\n\n1. **Use `chat/findMessages` with an empty `where` filter** to pull recent messages across all chats, then extract unique `remoteJid` values containing `@g.us`.\n2. **Match by content**: Fetch messages from each group JID using `where.key.remoteJid` and match sender names or message text to identify the target group.\n3. **Known group JIDs** (store in `whatsapp-vip-contacts.json` as you discover them):\n - ps112 family group: `6591887488-1464061170@g.us`\n - Overcomers group: `6591444911-1614821287@g.us`\n - 神的华美 group: `6596756578-1405873840@g.us` (content-sharing, casual/broadcast, local log only)\n - Overcomers/TCians group: `120363230236878544@g.us` (church fellowship, PJ sends weekly attendance reports, routine attendance = local log only)\n\nExample search:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"6591444911-1614821287@g.us\"}}, \"limit\": 10, \"order\": \"desc\"}'\n```\n\n## Processing Image/Media Messages\n\nWhen an incoming WhatsApp message is an image (imageMessage), video, or other media with no text or only a caption:\n\n1. **Fetch conversation context** as usual with `wa_recent_messages.py`\n2. **Download the media** using the Evolution API:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"message\": {\"key\": {\"id\": \"MESSAGE_ID\", \"fromMe\": false, \"remoteJid\": \"GROUP_OR_CHAT_JID\"}}, \"convertToMp4\": false}' \\\n -o /tmp/wa_media_response.json\n ```\n Then decode: `python3 -c \"import json,base64; d=json.load(open('/tmp/wa_media_response.json')); open('/tmp/wa_media.jpg','wb').write(base64.b64decode(d['base64']))\"`\n3. **Analyze the image** using `vision_analyze` tool with `/tmp/wa_media.jpg` path and a context-aware question\n4. **Fallback if vision is unavailable**: Use `tesseract /tmp/wa_media.jpg stdout` for OCR, combined with pixel analysis (PIL/numpy) and conversation context to infer what the image contains\n5. **Album messages** (`albumMessage`): These bundle multiple images. Each image arrives as a separate `imageMessage` with a `messageAssociation.parentMessageKey` linking them to the album. Download each image separately. The `expectedImageCount` field tells how many images to expect\n6. **messageAssociation**: When images have `parentMessageKey` with `fromMe: true`, they are replying to a message Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an incoming WhatsApp message is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification when possible:\n\n1. **Fetch conversation context first**, especially in groups:\n ```bash\n python3 /opt/data/scripts/wa_recent_messages.py \"\" 50\n ```\n2. **Download the audio** with Evolution API, saving the response to a file:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_audio_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"GROUP_OR_CHAT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_audio_req.json -o /tmp/wa_audio_response.json\n python3 - <<'PY'\n import json, base64\n d=json.load(open('/tmp/wa_audio_response.json'))\n b=d.get('base64') or d.get('data',{}).get('base64')\n if ',' in b[:100]: b=b.split(',',1)[1]\n open('/tmp/wa_audio.ogg','wb').write(base64.b64decode(b))\n PY\n ```\n3. **Convert OGG/Opus to WAV**:\n ```bash\n ffmpeg -y -i /tmp/wa_audio.ogg -ar 16000 -ac 1 /tmp/wa_audio.wav\n ```\n4. **Transcribe fallback when Whisper is unavailable**: the container may not have `whisper`, `faster-whisper`, `torch`, or `speech_recognition` installed globally. A reliable lightweight fallback is a temp venv with SpeechRecognition and Google web speech:\n ```bash\n python3 -m venv /tmp/srvenv\n /tmp/srvenv/bin/pip install --quiet SpeechRecognition\n /tmp/srvenv/bin/python - <<'PY'\n import speech_recognition as sr\n r=sr.Recognizer()\n with sr.AudioFile('/tmp/wa_audio.wav') as source:\n audio=r.record(source)\n print(r.recognize_google(audio, language='en-SG'))\n PY\n ```\n5. **Check adjacent voice notes from the same sender**. WhatsApp voice notes are often sent in quick sequential chunks. Download and transcribe nearby `audioMessage` records from the same sender before deciding whether to notify Vincent.\n6. Use the transcript plus recent context to classify the message. For group meetup planning, extract what/when/where/who and notify Vincent if actionable, including transcript confidence.\n\n### Video Messages\n\nWhen an incoming WhatsApp message is `videoMessage` (with or without a caption):\n\n1. **Fetch conversation context** first, as with any media message: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n2. **Download the video** using Evolution API (same pattern as images/audio — save response to file, then decode):\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_video_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"CONTACT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_video_req.json -o /tmp/wa_video_response.json\n python3 - <<'PY'\n import json, base64\n d = json.load(open('/tmp/wa_video_response.json'))\n b = d.get('base64') or d.get('data', {}).get('base64', '')\n if ',' in b[:100]: b = b.split(',', 1)[1]\n open('/tmp/wa_video.mp4', 'wb').write(base64.b64decode(b))\n print(f\"Video saved: {len(base64.b64decode(b))} bytes\")\n PY\n ```\n3. **Extract representative frames** with ffmpeg (do NOT try to analyze video directly — vision tools work on images):\n ```bash\n # Frame at start\n ffmpeg -y -ss 0 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame0.jpg\n # Frame at ~1/3 of the way (good for short videos)\n ffmpeg -y -ss 3 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame3s.jpg\n # If video is longer (>15s), also extract at midpoint\n # ffprobe -v quiet -print_format json -show_format /tmp/wa_video.mp4 # get duration\n # ffmpeg -y -ss $HALF_DURATION -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame_mid.jpg\n ```\n4. **Analyze the frames** with `vision_analyze`. Ask specific questions about content: what objects/people/text are visible, what type of video this appears to be (screen recording, personal clip, forwarded meme, etc.), and any text or identifying details.\n5. **Use the caption** from `videoMessage.caption` if present (can be empty/null). Combine frame analysis + caption + conversation context to classify the message intent.\n6. **Transcription of audio in video** is generally not feasible with available tools. If the video likely contains important spoken content (e.g. voice memo style), note this limitation and flag it to Vincent.\n\n**Key points:**\n- Always use `-o` flag and file-based decode; videos can be 1-10MB and base64 responses will be even larger\n- `--max-time 25` on curl is essential for video downloads\n- Short casual clips (showing a key fob, a parking spot, quick view of surroundings) are usually casual/social — classify accordingly and don't over-escalate\n- If `vision_analyze` fails (400 on models that don't support images, or 429 rate limit), fall back to conversation context inference alone\n\n### Poll Messages\nWhen `pollUpdateMessage` appears in chat history (someone voted), the message contains a `pollCreationMessageKey` with the original poll's message `id`. To get the poll question and options, query:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"id\": \"POLL_MESSAGE_ID\", \"remoteJid\": \"GROUP_JID\"}}, \"limit\": 1}'\n```\nThe response will have `pollCreationMessageV3` with `name` (the poll question) and `options` array. Always look up the poll when votes or poll references appear, since the poll text contains the event details.\n\n### `associatedChildMessage` (Motion Photos / Companion Media)\nWhatsApp motion photos and companion media arrive as `associatedChildMessage` type. These are short videos or supplementary images attached to a parent message via `messageAssociation.parentMessageKey`. The structure is:\n\n```\nmessage: {\n messageContextInfo: {\n messageAssociation: {\n associationType: 12,\n parentMessageKey: { id, fromMe, remoteJid }\n }\n },\n associatedChildMessage: {\n message: {\n videoMessage: { ... } // or imageMessage: { ... }\n }\n }\n}\n```\n\nKey points:\n- The `parentMessageKey.id` links back to the main message (the image/video they accompany)\n- When `parentMessageKey.fromMe: true`, the companion media is replying to a message Vincent sent\n- Captions may appear on the child's `videoMessage.caption` or `imageMessage.caption` inside the nested message, NOT at the top level\n- These often contain motion-photo short videos (3 seconds) or album companion images\n- Treat them as media attachments to the parent; don't process them as standalone messages requiring independent intent analysis\n\n### Sequential Image Sets (Non-Album)\nMultiple images sent rapidly by the same sender at similar timestamps may NOT be wrapped in an `albumMessage`. They appear as separate `imageMessage` entries. Always check for adjacent images from the same sender and download+OCR all of them together, as they often form a coherent multi-page set (e.g., route info, comparison tables, maps). Missing one can lose important context.\n\n### Key pitfalls\n- The Evolution API `getBase64FromMediaMessage` endpoint returns the full base64 payload (can be 100KB+ for images). Use `-o` flag to save to file, then parse with Python. Do NOT try to pipe it directly\n- `curl` with `--max-time 25` is essential; without it, large media responses can hang\n- The `albumMessage` type itself has no `fromMe` field; check each individual `imageMessage` instead\n- Image thumbnails in the webhook payload are truncated; always download the full image for analysis\n- **WhatsApp media URLs expire**: CDN download links (`mmg.whatsapp.net`) expire after roughly 24-48 hours. When `getBase64FromMediaMessage` returns a 400/404 error like \"Failed to fetch stream from https://mmg.whatsapp.net/...\", the media is no longer retrievable via the API. This means you CANNOT download adjacent/older voice notes, images, or documents for context if they're more than a day or two old. Fall back to conversation text context only. Fresh/incoming messages are almost always still downloadable.\n- **Voice note transcription reliability**: Short voice notes (<15s), especially in Singlish/Chinese-English mix common in Singapore contacts, often produce unreliable transcriptions via SpeechRecognition+Google. Always flag low-confidence transcriptions to Vincent rather than acting on potentially garbled text.\n- **Vision API failures**: `vision_analyze` can fail with 429 (rate limit) **or 400** (model doesn't support image input, e.g. glm-5.1 via ollama-cloud). Always have a fallback plan. When vision is unavailable, try tesseract OCR, then pixel analysis (PIL/numpy for brightness zones, text region detection via row-level dark-pixel density)\n- **OCR can fail on decorative/complex images**: Obituaries, memorial notices, religious flyers with ornate fonts, or dark-background images often defeat tesseract even with aggressive preprocessing (upscaling, thresholding, inverting, PSM mode variations). When OCR returns gibberish or empty, fall back to conversation context inference: use preceding/surrounding messages to determine what the image likely contains, then draft an appropriate response based on that context\n- **OCR on devotional/dark-background images**: These often defeat tesseract on full-image scans. Use a multi-strategy approach: (1) run tesseract on the full image first, (2) if text is sparse, crop into regions (top 15%, middle, bottom 15%), upscale 2-3x, enhance contrast (2-3x), and try both normal and inverted (`ImageOps.invert`) versions with `--psm 6`, (3) combine all partial OCR results. For Exodus/verse images, the title (e.g. \"WHEN MOSES' HANDS WERE\") is often in decorative font that OCR misses, while the verse body text is usually readable.\n- **Prioritize conversation context**: When processing group images, always fetch recent context FIRST (before attempting vision/OCR). If image analysis fails, the conversation history often makes the image's purpose clear enough to draft a meaningful reply\n- **DM conversation context is one-sided (fromMe only)**: The Evolution API `findMessages` endpoint often returns only Vincent's sent messages (`fromMe: true`) for DM chats, not the contact's incoming messages. This means `wa_recent_messages.py` can show only one side of the conversation. When processing a DM webhook, do NOT rely solely on fetched history for the contact's side. The webhook payload already contains the incoming message text. Use what the webhook gives you plus Vincent's side from the API. If you need the contact's recent messages, check if `fromMe: false` records exist; if none are returned, the API simply doesn't have them and you must work with what you have (webhook message + Vincent's replies for context).\n- **`findMessages` ignores `key.fromMe` filter**: Passing `\"key.fromMe\": false` in the `where` clause still returns all messages (including `fromMe: true`). To find messages from a specific sender, fetch results and filter client-side by checking `fromMe` in Python.\n- **`findMessages` caps at ~50 records per request**: Even with `limit: 200` or `limit: 500`, the API returns at most 50 records. When one party sends many consecutive messages, the other party's messages may not appear in the first 50 results. Use pagination or fetch multiple pages if you need older messages.\n- **Video frame extraction**: Use `ffmpeg -y -ss N -i video.mp4 -frames:v 1 -update 1 output.jpg` (the `-update 1` flag is needed; without it ffmpeg treats the output as an image sequence and fails). Extract frames at multiple timestamps (start, ~1/3, midpoint for longer videos) since a single frame may miss the key content.\n- **ffmpeg \"Invalid color space\" failure on WhatsApp videos**: Some WhatsApp H.264 videos have `color_space=reserved` and `color_primaries=reserved` metadata that ffmpeg cannot decode at all. Every codec/format/filter combination fails with \"Invalid color space\" / return code 234. This is NOT fixable with `-vf setparams`, `-color_trc`, `-colorspace`, `-color_primaries`, or `-vf format=` flags because the error occurs at the decoder level before any filter runs. **Do not waste time trying ffmpeg workarounds.** When this happens, fall back to: (1) conversation context inference (who sent it, what thread it's in, adjacent messages), (2) the jpegThumbnail embedded in the videoMessage (small but gives a rough idea; base64-encoded JPEG, typically 600-900 bytes, use `vision_analyze` on it if available), or (3) if all analysis fails, just classify based on surrounding text context.\n- **`wa_recent_messages.py` JSON output breaks `json.loads()`**: The script's stdout contains WhatsApp message content with control characters (emoji, CJK, deviceListMetadata binary blobs) that cause `json.loads()` to fail with `Invalid control character` or `Expecting ',' delimiter` errors. Even `strict=False` does not reliably fix it. **Reliable approach**: use `curl -o /tmp/wa_msgs.json` to save the Evolution API response directly to a file (the file write handles binary safely), then parse with `json.load(open('/tmp/wa_msgs.json'))` in Python. Example:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 15 -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"CONTACT_JID\"}}, \"limit\": 20}' -o /tmp/wa_msgs.json\n ```\n Then in Python: `data = json.load(open('/tmp/wa_msgs.json')); records = data['messages']['records']`\n\n## API Reference\n\nFor full API endpoints (media, contacts, location, buttons) — see [references/api.md](references/api.md).\n\n[Skill directory: /opt/data/skills/personal/whatsapp-evolution]\nResolve any relative paths in this skill (e.g. `scripts/foo.js`, `templates/config.yaml`) against that directory, then run them with the terminal tool using the absolute path.\n\n[This skill has supporting files:]\n- references/api.md -> /opt/data/skills/personal/whatsapp-evolution/references/api.md\n- scripts/process_queue.py -> /opt/data/skills/personal/whatsapp-evolution/scripts/process_queue.py\n- scripts/send_text.sh -> /opt/data/skills/personal/whatsapp-evolution/scripts/send_text.sh\n\nLoad any of these with skill_view(name=\"personal/whatsapp-evolution\", file_path=\"\"), or run scripts directly by absolute path (e.g. `node /opt/data/skills/personal/whatsapp-evolution/scripts/foo.js`).\n\nThe user has provided the following instruction alongside the skill invocation: Incoming WhatsApp webhook received.\n\nEvent: messages.upsert\nInstance: n8n-WA\nSender: Shen Jiaming\nSender JID: 120363418226875471@g.us\nFrom me: False\nMessage type: reactionMessage\nMessage: {\n \"event\": \"messages.upsert\",\n \"instance\": \"n8n-WA\",\n \"data\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"2AB4CEDC2C7B663FAD47\",\n \"participant\": \"5703783682189@lid\",\n \"participantAlt\": \"6592760070@s.whatsapp.net\",\n \"addressingMode\": \"lid\"\n },\n \"pushName\": \"Shen Jiaming\",\n \"status\": \"DELIVERY_ACK\",\n \"message\": {\n \"senderKeyDistributionMessage\": {\n \"groupId\": \"120363418226875471@g.us\",\n \"axolotlSenderKeyDistributionMessage\": {\n \"0\": 51,\n \"1\": 8,\n \"2\": 246,\n \"3\": 152,\n \"4\": 202,\n \"5\": 175,\n \"6\": 1,\n \"7\": 16,\n \"8\": 0,\n \"9\": 26,\n \"10\": 32,\n \"11\": 92,\n \"12\": 193,\n \"13\": 133,\n \"14\": 144,\n \"15\": 17,\n \"16\": 10,\n \"17\": 33,\n \"18\": 94,\n \"19\": 63,\n \"20\": 88,\n \"21\": 78,\n \"22\": 231,\n \"23\": 24,\n \"24\": 98,\n \"25\": 232,\n \"26\": 67,\n \"27\": 213,\n \"28\": 38,\n \"29\": 97,\n \"30\": 67,\n \"31\": 126,\n \"32\": 100,\n \"33\": 165,\n \"34\": 41,\n \"35\": 177,\n \"36\": 56,\n \"37\": 11,\n \"38\": 30,\n \"39\": 97,\n \"40\": 106,\n \"41\": 71,\n \"42\": 39,\n \"43\": 34,\n \"44\": 33,\n \"45\": 5,\n \"46\": 126,\n \"47\": 55,\n \"48\": 83,\n \"49\": 183,\n \"50\": 143,\n \"51\": 45,\n \"52\": 86,\n \"53\": 193,\n \"54\": 86,\n \"55\": 177,\n \"56\": 105,\n \"57\": 50,\n \"58\": 165,\n \"59\": 209,\n \"60\": 11,\n \"61\": 198,\n \"62\": 130,\n \"63\": 151,\n \"64\": 165,\n \"65\": 30,\n \"66\": 46,\n \"67\": 134,\n \"68\": 211,\n \"69\": 138,\n \"70\": 194,\n \"71\": 134,\n \"72\": 58,\n \"73\": 84,\n \"74\": 115,\n \"75\": 25,\n \"76\": 87,\n \"77\": 92\n }\n },\n \"messageContextInfo\": {\n \"threadId\": [],\n \"deviceListMetadata\": {\n \"senderKeyIndexes\": [],\n \"recipientKeyIndexes\": [],\n \"senderTimestamp\": {\n \"low\": 1780809805,\n \"high\": 0,\n \"unsigned\": true\n },\n \"recipientKeyHash\": {\n \"0\": 178,\n \"1\": 98,\n \"2\": 210,\n \"3\": 215,\n \"4\": 214,\n \"5\": 115,\n \"6\": 141,\n \"7\": 49,\n \"8\": 137,\n \"9\": 97\n },\n \"recipientTimestamp\": {\n \"low\": 1779868286,\n \"high\": 0,\n \"unsigned\": true\n }\n },\n \"deviceListMetadataVersion\": 2\n },\n \"reactionMessage\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"ACAF388AF06A2F0E1C1777A7917ED565\",\n \"participant\": \"207726193995780@lid\"\n },\n \"text\": \"\\u2764\\ufe0f\",\n \"senderTimestampMs\": {\n \"low\": -1597314382,\n \"high\": 414,\n \"unsigned\": false\n }\n }\n },\n \"messageType\": \"reactionMessage\",\n \"messageTimestamp\": 1780814118,\n \"instanceId\": \"b2bfaa30-d5f1-4b22-b979-791892619414\",\n \"source\": \"unknown\"\n },\n \"destination\": \"https://jimmy.huper.asia/webhooks/whatsapp-incoming\",\n \"date_time\": \"2026-06-07T03:35:19.234Z\",\n \"sender\": \"6590054438@s.whatsapp.net\",\n \"server_url\": \"https://evolution.huper.asia\",\n \"apikey\": \"5E4C532223A0-460B-A749-16C584F56E74\"\n}\n\nPrimary goal: REDUCE NOISE. Stay silent in the main Telegram chat unless the message truly needs Vincent's immediate attention. Approval-worthy reply drafts should go to the jimmy-wa-drafts sidecar, NOT to the main Telegram chat.\n\nFollow the whatsapp-evolution skill and /opt/data/agent-state/memory/whatsapp-vip-contacts.json, with these stricter notification rules:\n\nABSOLUTE SILENCE RULES:\n- If fromMe is true: ignore. Do NOT call send_message. Final response local log only.\n- Ignore status/update/read-receipt events or payloads with no real message content.\n- Do NOT send main Telegram summaries for ordinary DMs, routine logistics, casual family messages, generic group events, group chatter, forwards, memes, devotional/religious shares, links, standalone codes, normal confirmations, or suggested replies.\n- Do NOT ask \"Want me to add this to Calendar?\" for WhatsApp group events unless Vincent is explicitly invited, named, responsible, or the event is time-sensitive and clearly relevant to him today.\n- Do NOT surface family messages just because they are family. Only surface if urgent, safety-related, school/pickup critical, money/legal/medical, or explicitly requires Vincent now.\n\nSIDEcar RULES:\n- If a reply may be useful but it is not urgent, POST it to https://jimmy-draft.huper.asia/drafts using the whatsapp-evolution Draft Approvals flow. Do NOT duplicate that draft/summary in the main Telegram chat.\n- Prayer and birthday replies should go to the sidecar when appropriate, not the main Telegram chat, unless the underlying situation is urgent/sensitive and Vincent needs immediate awareness.\n- Before drafting any reply or approval suggestion, fetch recent context: python3 /opt/data/scripts/wa_recent_messages.py \"120363418226875471@g.us\" 30. For active/ambiguous groups use 50.\n\nMAIN TELEGRAM ALERTS ARE ALLOWED ONLY FOR:\n1. VIP contacts marked always_call or always_notify when the message is important/actionable, or any VIP call event.\n2. Calls from family/VIP that appear missed/ringing and likely need immediate attention. Avoid repeated alerts for accepted/terminated duplicate call state updates.\n3. Emergencies, safety issues, medical issues, school pickup/child logistics requiring action now, legal/money/account/security matters, or time-sensitive commitments in the next few hours.\n4. Messages explicitly asking Vincent to call back, decide now, approve payment, provide OTP/code for an active task, or respond urgently.\n5. Pipeline errors or webhook processing failures.\n\nCALL RULES:\n- VIP contacts (e.g. Dn Jeff, Ng Whye Keong): call Vincent immediately outside quiet hours when actionably important or on ringing/missed call. During 1am-10am SGT, Telegram only unless life-and-death emergency.\n- Do not call for casual VIP confirmations or non-actionable courtesy messages.\n\nTONE/RULES FOR DRAFTS:\n- Proposed replies must be Christian and grace-based where appropriate: warm, humble, encouraging, non-judgmental, peace-making, not preachy.\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- NEVER auto-reply directly on WhatsApp unless Vincent explicitly approves or future config marks that exact category pre-approved.\n\nDelivery rule:\n- For allowed main alerts only: call send_message(target=\"telegram\", message=...). Keep it concise.\n- For non-urgent reply suggestions: use the sidecar only.\n- For everything else: DO NOT call send_message. Keep final response as local-only log.\n", "timestamp": 1780814124.4119925, "anchor": true}, {"id": 104135, "role": "assistant", "content": "This is a reaction message (❤️) from Shen Jiaming in a group chat. No real text content, no actionable information. Per the silence rules, this is ignored entirely.\n\n**Local log only.** No Telegram notification, no draft, no reply.", "timestamp": 1780814124.428677}], "bookend_end": [], "messages_before": 0, "messages_after": 2}, {"session_id": "20260607_063022_7169e65f", "when": "June 07, 2026 at 06:30 AM", "source": "webhook", "model": "glm-5.1", "title": null, "matched_role": "user", "match_message_id": 104131, "snippet": "...true`, they are replying to a >>>message<<< Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an >>>incoming<<< WhatsApp >>>message<<< is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification...", "bookend_start": [], "messages": [{"id": 104131, "role": "user", "content": "[IMPORTANT: The user has invoked the \"whatsapp-evolution\" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]\n\n---\nname: whatsapp-evolution\ndescription: WhatsApp messaging via Evolution API — send, receive, analyze, and auto-reply. Use when processing an incoming WhatsApp message delivered via the webhook, replying on Vincent's behalf, liaising with contacts, sending messages, or handling WhatsApp-related tasks. Covers intent analysis, auto-reply rules, escalation (VIP → phone call), conversation context fetching, and draft approvals via the jimmy-wa-drafts sidecar.\n---\n\n# WhatsApp via Evolution API\n\n## Credentials\nStored in `/opt/data/secrets/.whatsapp-evolution-credentials`:\n- `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE`\n\nLoad with: `export $(cat .whatsapp-evolution-credentials | xargs)`\n\n## Sending Messages (direct, no approval)\n\n```bash\nexport $(cat .whatsapp-evolution-credentials | xargs)\ncurl -s -X POST \"${EVOLUTION_API_URL}/message/sendText/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"number\": \"6591234567\", \"text\": \"Hello!\"}'\n```\n\n- Phone format: country code, no `+` (e.g. `6590054438`)\n- Groups: use JID `{groupid}@g.us`\n- For media/location/contacts — see [references/api.md](references/api.md)\n\nUse direct send only when Vincent has explicitly commanded/approved the exact send, or for pre-approved auto-replies/follow-ups. For any reply that needs Vincent's review, use the sidecar (next section).\n\n### Direct replies when Vincent names an existing WhatsApp thread\n\nIf Vincent says to “reply to X” or “confirm with X” and has clearly approved the action, but the contact JID/number is not in the current prompt:\n\n1. Search recent session history first (`session_search`) for the contact name, clinic/business name, or appointment details. Prior webhook summaries often contain the exact `remoteJid`, alternate phone JID, sender name, and context.\n2. If session history identifies the WhatsApp phone JID, send directly only if Vincent’s instruction is explicit enough to act. Keep the text concise and natural.\n3. If only a LID JID is found, look for `remoteJidAlt`/alternate phone JID in the session summary or fetch contacts/chats before sending.\n4. Do not invent phone numbers from web search when replying to an existing WhatsApp conversation. Use web search only as supporting context, not as the destination, unless Vincent explicitly asks to contact a public number.\n\n\n### Sending trip/camp details from retrieved emails\n\nWhen Vincent explicitly asks to send extracted school trip/camp details to a child or family member, send directly. Keep it practical and grounded:\n- Include dates, reporting/pick-up times, flights, venue/itinerary highlights, deadlines, forms, and safety/medication notes when found.\n- If an official packing list is missing, say that clearly. Do not invent one. You may add a short “based on itinerary/weather, prepare…” list, clearly labeled as practical guidance rather than official school instructions.\n- Avoid long explanations about the search process in WhatsApp. Send the useful details.\n\n### Sending a local image/file as media\n\nFor user-approved media sends, base64-encode the local file and call `sendMedia`:\n\n```bash\nexport $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\npython3 - <<'PY'\nimport base64, json\npath = '/tmp/file.png'\npayload = {\n 'number': '6591234567',\n 'mediatype': 'image',\n 'mimetype': 'image/png',\n 'caption': 'Caption here',\n 'media': base64.b64encode(open(path, 'rb').read()).decode(),\n 'fileName': 'file.png',\n}\nopen('/tmp/wa_media_payload.json', 'w').write(json.dumps(payload))\nPY\ncurl -s --max-time 60 -X POST \"${EVOLUTION_API_URL}/message/sendMedia/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_media_payload.json\n```\n\nWhen sending QR tickets or passes, verify the crop/image contains the full QR code and the intended attendee name before sending; the QR screenshot is effectively the redeemable ticket.\n\n## Fetching Conversation Context\n\nBefore drafting or proposing any reply, always fetch recent chat history for context:\n\n```bash\npython3 /opt/data/scripts/wa_recent_messages.py \"6591234567@s.whatsapp.net\" 30\n```\n\nUse `limit=30` by default. For active group chats or ambiguous situations, fetch up to 50 messages. Read enough context to understand who is being discussed, current tone, previous commitments, and whether Vincent has already replied. The script returns newest messages first (descending order).\n\n## Draft Approvals — `jimmy-wa-drafts` sidecar\n\nWhen you need Vincent to approve a WhatsApp reply, POST the draft to the sidecar. It owns the entire approval flow: Telegram message with 4 inline buttons, Telegram WebApp for Edit/Context, sending to WhatsApp via Evolution API on Approve.\n\n### Posting a draft\n\nAlways include an `extras` object with the original WA context. The sidecar passes `extras` through verbatim if Vincent later clicks \"Context\" — without it, your redraft will be flying blind.\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\n \"contact\": \"120363351787895353@g.us\",\n \"contact_name\": \"Cycling group\",\n \"summary\": \"Karen sent a blessing over the bike ride; she's not joining\",\n \"draft_text\": \"Thanks Karen, will miss you both — praying for a good time too 🙏\",\n \"extras\": {\n \"group_jid\": \"120363351787895353@g.us\",\n \"sender_name\": \"Karen Tan\",\n \"sender_jid\": \"6597630553@s.whatsapp.net\",\n \"category\": \"prayer/blessing\",\n \"original_message\": \"Happy cycling, to those who can make it… Dn Jeff and I had made plans for this PM so we won't be able to join y'all. Praying the Lord's favor, protection and good-weather blessings be upon y'all and your route!\",\n \"extracted\": {\n \"what\": \"Cycling from East Coast Park to Changi Village, then dinner\",\n \"when\": \"Today, 4:00 PM\",\n \"where\": \"East Coast Park to Changi Village\",\n \"who\": \"Group members who voted yes for cycling/dinner or dinner\"\n },\n \"confidence\": \"high\",\n \"why_appropriate\": \"Multiple group messages are blessings/wishes; Vincent has not replied; tone matches grace-based norm.\",\n \"wa_message_id\": \"ABC123XYZ\"\n }\n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\nResponse: `{\"draft_id\":\"abc123def\",\"telegram_message_id\":42}`\n\nWhat to put in `extras` (use whatever makes sense for the message type):\n\n- **Always**: `original_message` (verbatim WA text), `sender_name`, `category` (e.g. `prayer`, `birthday`, `event`, `dm-reply`, `enquiry`)\n- **Group messages**: `group_jid`, `sender_jid`\n- **DMs**: `sender_jid`\n- **Strongly recommended (drives the rich Telegram render)**: `extracted` (dict of who/what/when/where), `confidence` (`high`/`medium`/`low`), `why_appropriate` (one-line reasoning shown to Vincent so he can decide)\n- **Optional**: `wa_message_id`, `tone_hint` (e.g. `warm-grace-based`), `sender_phone` (auto-derived from `sender_jid` if omitted)\n\nVincent receives a Telegram message in the `@jimmy_WA_bot` chat with 4 inline buttons. The sidecar handles every interaction; **you do not track replies, callbacks, or message IDs**. Just POST and move on.\n\n| Button | Behavior |\n|--------|----------|\n| **Approve** | Sidecar sends draft to WhatsApp |\n| **Edit** | Opens a Telegram WebApp with draft pre-filled; Vincent edits + sends; sidecar pushes to WhatsApp |\n| **Context** | Opens a Telegram WebApp with empty textarea + the original WA message visible; Vincent provides context. (When `HERMES_REDRAFT_WEBHOOK` is wired in the sidecar's env, this will POST back to Jimmy for a redraft. Until then, the context is just stored on the draft.) |\n| **Pass** | Sidecar deletes the draft |\n\n### JSON escaping when `draft_text` contains quotes\n\nSingle quotes (`It's`, `God's`) break shell interpolation. Write the JSON to a temp file first:\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\"contact\":\"6591234567@s.whatsapp.net\",\"contact_name\":\"John\",\"summary\":\"…\",\"draft_text\":\"It's nice to hear from you!\"}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\n### Networking\n\n`https://jimmy-draft.huper.asia` is the public HTTPS endpoint of the sidecar (TLS-terminated by Traefik on ubuntu-server). HTTPS is required because Hermes blocks plain-HTTP commands via tirith. The internal Docker DNS hostname http://jimmy-wa-v64mqe:8080 also works for non-shell calls but the agent is forbidden from using it.\n\n### Sidecar status & logs\n\n- Source: (auto-deploys on push)\n- Public URL: (`/health` for ping)\n- Container: `jimmy-wa-v64mqe.1.` on ubuntu-server\n- Bot: `@jimmy_WA_bot`\n\n## Handling a Redraft Trigger (`wa-redraft` webhook)\n\nWhen Vincent clicks the **Context** button on a draft, the sidecar:\n1. Edits the original Telegram message to \"🔄 Adding context for X… agent is redrafting.\" (removes the old buttons so Vincent can't approve the now-stale text)\n2. POSTs to your `wa-redraft` webhook (`http://localhost:8644/webhooks/wa-redraft`) with the original draft fields plus Vincent's `context` instructions and the `extras` you stored when you first created the draft.\n\nWhen you receive a `wa-redraft` activation, the prompt template will lay out the fields. Your job:\n\n1. Read Vincent's `context` carefully — that's the steering signal. The `previous_draft` is just history, not necessarily a starting point.\n2. Re-fetch recent WA history if the original context isn't enough (`python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`). Use `extras.group_jid` for groups or `contact` for DMs.\n3. Write a NEW draft that addresses Vincent's instructions while staying faithful to the original WA context (in `extras.original_message`, `extras.extracted`).\n4. POST the new draft back to the sidecar **with the same `draft_id`** so the existing draft is updated, not duplicated:\n\n```bash\ncat > /tmp/wa_redraft.json <\",\n \"contact\": \"\",\n \"contact_name\": \"\",\n \"summary\": \"\",\n \"draft_text\": \"\",\n \"extras\": \n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_redraft.json\n```\n\nThe sidecar will send a NEW Telegram message with 4 fresh buttons. Vincent can Approve, Edit, Context-again (loop), or Pass.\n\nDo **not** send to WhatsApp directly during a redraft — the approval flow is the sidecar's job.\n\n## Incoming Message Processing\n\n### Delivery Pipeline (current)\n1. Evolution API POSTs each `MESSAGES_UPSERT` webhook directly to Hermes at `http://localhost:8644/webhooks/whatsapp-incoming` (proxied via `jimmy.huper.asia/webhooks/whatsapp-incoming`)\n2. Hermes webhook route (`whatsapp-incoming`, dynamic subscription) validates and spawns an agent\n3. Agent loads `whatsapp-evolution` skill, follows the rules below\n4. Webhook subscription uses `deliver: log` because webhooks do not support `deliver: local`; the agent explicitly calls `send_message(target=\"telegram\", ...)` only for important/actionable/VIP/event/birthday/prayer/error items, OR posts a draft to the sidecar for approval items.\n\n**No more OpenClaw, no Python webhook server, no port 5001, no keepalive cron.** If messages aren't being processed, check: nginx proxy to port 8644, `hermes webhook list` for the subscription, Evolution API webhook URL setting.\n\n## Contact Rules & Escalation\n\nConfig: `memory/whatsapp-vip-contacts.json`\n\n### Priority Levels\n\n| Priority | Action | Example |\n|----------|--------|---------|\n| **VIP (call-always)** | **CALL Vincent immediately via Twilio without asking.** Do NOT ask for approval first. During quiet hours (1am-10am SGT) suppress call and use Telegram notification only, unless life-and-death emergency. Match VIPs by both phone JID (`659...@s.whatsapp.net`) and LID JID (`...@lid`) when present, because recent chats may arrive with `key.remoteJid` as an LID and the phone number only in `key.remoteJidAlt`. | Dn Jeff (+6597630553) |\n| **Family** | Always notify Vincent on Telegram | ps112 group, family members |\n| **Auto-reply** | Reply based on context file, notify Vincent | Repair shops |\n| **Normal** | Analyze intent, reply if appropriate, notify if important | Everyone else |\n| **Broadcast/Spam** | Ignore silently | Forwarded job posts, chain messages |\n\n### Quiet Hours\n- 1am-10am SGT (17:00-02:00 UTC): avoid phone calls, including for Dn Jeff / VIP call-always contacts\n- **Exception:** life-and-death emergencies always call regardless of quiet hours\n- During quiet hours, use Telegram notification only and mention that a call was suppressed due to quiet hours\n\n## Making a Call via Twilio\n\nCredentials: `/opt/data/secrets/.twilio-credentials-new` (`TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`). From: `+16614809287`, Vincent: `+6590054438`.\n\nPreferred script: `/opt/data/scripts/call_vincent.py`, now using the Twilio Python SDK baked into the container.\n\n```bash\npython3 /opt/data/scripts/call_vincent.py \"Hello Vincent, this is Jimmy. Urgent summary here.\"\n```\n\n## Action Rules\n\n### Default: DO NOT auto-reply directly, and do not spam main Telegram\nCollect intent and stay local unless something truly needs Vincent's immediate attention.\n\n**When you need Vincent to approve a reply, POST a draft to the sidecar (see \"Draft Approvals\" above).** The sidecar handles the Telegram notification with buttons, the WebApp for editing, and sending to WhatsApp on approve. Do **not** duplicate that draft or summary in the main Telegram chat.\n\nMain Telegram WhatsApp summaries are muted by default. Only use `send_message(target=\"telegram\", ...)` for urgent/important/actionable items that need Vincent now: VIP actionable messages/calls, missed/ringing family calls, emergencies, safety/medical/school-pickup issues, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors. Routine logistics, casual family pings, generic group events, links, standalone codes, normal confirmations, and suggested replies should be local-only or sidecar-only.\n\nDo NOT send WhatsApp replies directly unless Vincent explicitly approves them or the category is pre-approved.\n\n### Tone (when proposing or sending replies)\n- Casual, direct, no sign-offs (no \"Thanks & Regards\", no \"- Vincent\")\n- Ask questions instead of giving instructions\n- Keep it natural — like texting a friend\n- Christian and grace-based: warm, humble, encouraging, non-judgmental, peace-making\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- For prayer/birthday contexts, use gentle faith-filled language; avoid sounding preachy, performative, or overly formal\n\n### What to Do With Each Message\n\n**DMs (personal chats):**\n- Analyze intent\n- Notify Vincent on Telegram (via `send_message(target=\"telegram\", ...)`) only if important/actionable/time-sensitive, requires a decision, comes from a VIP contact\n- If you have a useful suggested reply, POST it as a draft to the sidecar instead of a Telegram notification — Vincent gets the buttons inline\n- Casual/social/non-actionable DMs → local log only, no Telegram notification, no draft\n- Do NOT reply unless Vincent says to\n- VIP contacts → **call Vincent immediately without asking**. Do NOT post a draft. Just call. During quiet hours (1am-10am SGT), suppress the call and send Telegram notification only, unless life-and-death emergency.\n\n**Group chats — events/meetups:**\n- If someone posts about an event, gathering, meetup, dinner, etc:\n 1. Extract: what, when, where, who\n 2. Notify Vincent on Telegram with details (no draft needed unless you also want to suggest an RSVP reply)\n 3. Ask: \"Want me to add this to Google Calendar and set a reminder?\"\n\n**Group chats — prayer requests / praying for others:**\n- If someone asks for prayer, shares a burden, illness, grief, struggle, thanksgiving, or asks the group to pray:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Identify who/what to pray for and any sensitive details\n 3. Draft a short, grace-based Christian reply/prayer\n 4. POST it to the sidecar with `summary` describing the request and `draft_text` being the prayer\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — birthday greetings:**\n- If you see birthday wishes for someone in a group:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Note who the birthday person is and relationship/context if inferable\n 3. Draft a warm, grace-based Christian birthday greeting\n 4. POST it to the sidecar\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — general:**\n- Casual banter, memes, forwards, broadcast spam → ignore silently\n- Religious content shares → ignore unless specifically addressed to Vincent\n\n### When to Notify Vincent (Telegram) vs Post a Draft\n\n**Use `send_message(target=\"telegram\", ...)` sparingly** for plain notifications only when the message needs Vincent's immediate attention: VIP actionable DMs/calls, emergencies, school pickup/child logistics requiring action now, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors.\n\n**Use the sidecar (POST `/drafts`)** whenever you have a suggested reply for Vincent to approve: prayer responses, birthday greetings, DM replies, group RSVPs. Do not also send the same summary to the main Telegram chat.\n\nDo **not** notify or draft for casual/social/non-actionable DMs, routine logistics, read receipts/status updates, religious shares not addressed to Vincent, memes, forwarded/broadcast/spam, generic group events, standalone codes, normal confirmations, or general group chatter. Family messages are not automatically main-chat alerts unless urgent, safety-related, school/pickup critical, or explicitly requiring Vincent now.\n\n## Intent Categories\n\nWhen analyzing incoming messages, classify as:\n\n1. **Urgent** — time-sensitive, emergencies → call Vincent outside quiet hours; during quiet hours use Telegram only unless life-and-death\n2. **Important** — decisions, money, appointments, anything consequential → call Vincent outside quiet hours; during quiet hours notify on Telegram only unless life-threatening\n3. **Request** — asking Vincent to do something → **notify Vincent** (Telegram) or post a draft if you have a safe reply\n4. **Enquiry** — asking for info, pricing, availability → auto-reply if context available, otherwise post a draft\n5. **Confirmation** — confirming plans, bookings → acknowledge and notify Vincent\n6. **Casual/Social** — greetings, memes, forwards → usually ignore or react\n7. **Broadcast/Spam** — forwarded job posts, chain messages → ignore\n\n## Finding Individual Contacts and JIDs\n\nWhen Vincent asks to monitor or prioritize a specific WhatsApp contact and only gives a name:\n\n1. Query contacts and chats, then search both display names and serialized message metadata:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n python3 - <<'PY'\n import os, requests, json\n base=os.environ['EVOLUTION_API_URL']; inst=os.environ['EVOLUTION_INSTANCE']; key=os.environ['EVOLUTION_API_KEY']\n headers={'apikey':key,'Content-Type':'application/json'}\n terms=['whye','keong'] # replace with lowercase search terms\n for ep in ['findContacts','findChats']:\n r=requests.post(f'{base}/chat/{ep}/{inst}', headers=headers, json={}, timeout=30)\n print('endpoint', ep, 'status', r.status_code)\n for obj in (r.json() if r.ok else []):\n if any(t in json.dumps(obj, ensure_ascii=False).lower() for t in terms):\n print(json.dumps(obj, ensure_ascii=False)[:1000])\n PY\n ```\n2. If `findChats` shows `remoteJid` ending in `@lid`, inspect `lastMessage.key.remoteJidAlt` for the phone JID. Store both in `memory/whatsapp-vip-contacts.json` when configuring VIP/call rules.\n3. Verify the number exists with `/chat/whatsappNumbers/{instance}` before finalizing.\n4. Validate the JSON config after editing: `python3 -m json.tool /opt/data/agent-state/memory/whatsapp-vip-contacts.json >/tmp/wa_vip_validated.json`.\n\n## Finding Group JIDs\n\nThe `group/fetchGroups` endpoint returns 404 on some Evolution API instances (including ours). To find a group JID when you don't have it:\n\n1. **Use `chat/findMessages` with an empty `where` filter** to pull recent messages across all chats, then extract unique `remoteJid` values containing `@g.us`.\n2. **Match by content**: Fetch messages from each group JID using `where.key.remoteJid` and match sender names or message text to identify the target group.\n3. **Known group JIDs** (store in `whatsapp-vip-contacts.json` as you discover them):\n - ps112 family group: `6591887488-1464061170@g.us`\n - Overcomers group: `6591444911-1614821287@g.us`\n - 神的华美 group: `6596756578-1405873840@g.us` (content-sharing, casual/broadcast, local log only)\n - Overcomers/TCians group: `120363230236878544@g.us` (church fellowship, PJ sends weekly attendance reports, routine attendance = local log only)\n\nExample search:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"6591444911-1614821287@g.us\"}}, \"limit\": 10, \"order\": \"desc\"}'\n```\n\n## Processing Image/Media Messages\n\nWhen an incoming WhatsApp message is an image (imageMessage), video, or other media with no text or only a caption:\n\n1. **Fetch conversation context** as usual with `wa_recent_messages.py`\n2. **Download the media** using the Evolution API:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"message\": {\"key\": {\"id\": \"MESSAGE_ID\", \"fromMe\": false, \"remoteJid\": \"GROUP_OR_CHAT_JID\"}}, \"convertToMp4\": false}' \\\n -o /tmp/wa_media_response.json\n ```\n Then decode: `python3 -c \"import json,base64; d=json.load(open('/tmp/wa_media_response.json')); open('/tmp/wa_media.jpg','wb').write(base64.b64decode(d['base64']))\"`\n3. **Analyze the image** using `vision_analyze` tool with `/tmp/wa_media.jpg` path and a context-aware question\n4. **Fallback if vision is unavailable**: Use `tesseract /tmp/wa_media.jpg stdout` for OCR, combined with pixel analysis (PIL/numpy) and conversation context to infer what the image contains\n5. **Album messages** (`albumMessage`): These bundle multiple images. Each image arrives as a separate `imageMessage` with a `messageAssociation.parentMessageKey` linking them to the album. Download each image separately. The `expectedImageCount` field tells how many images to expect\n6. **messageAssociation**: When images have `parentMessageKey` with `fromMe: true`, they are replying to a message Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an incoming WhatsApp message is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification when possible:\n\n1. **Fetch conversation context first**, especially in groups:\n ```bash\n python3 /opt/data/scripts/wa_recent_messages.py \"\" 50\n ```\n2. **Download the audio** with Evolution API, saving the response to a file:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_audio_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"GROUP_OR_CHAT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_audio_req.json -o /tmp/wa_audio_response.json\n python3 - <<'PY'\n import json, base64\n d=json.load(open('/tmp/wa_audio_response.json'))\n b=d.get('base64') or d.get('data',{}).get('base64')\n if ',' in b[:100]: b=b.split(',',1)[1]\n open('/tmp/wa_audio.ogg','wb').write(base64.b64decode(b))\n PY\n ```\n3. **Convert OGG/Opus to WAV**:\n ```bash\n ffmpeg -y -i /tmp/wa_audio.ogg -ar 16000 -ac 1 /tmp/wa_audio.wav\n ```\n4. **Transcribe fallback when Whisper is unavailable**: the container may not have `whisper`, `faster-whisper`, `torch`, or `speech_recognition` installed globally. A reliable lightweight fallback is a temp venv with SpeechRecognition and Google web speech:\n ```bash\n python3 -m venv /tmp/srvenv\n /tmp/srvenv/bin/pip install --quiet SpeechRecognition\n /tmp/srvenv/bin/python - <<'PY'\n import speech_recognition as sr\n r=sr.Recognizer()\n with sr.AudioFile('/tmp/wa_audio.wav') as source:\n audio=r.record(source)\n print(r.recognize_google(audio, language='en-SG'))\n PY\n ```\n5. **Check adjacent voice notes from the same sender**. WhatsApp voice notes are often sent in quick sequential chunks. Download and transcribe nearby `audioMessage` records from the same sender before deciding whether to notify Vincent.\n6. Use the transcript plus recent context to classify the message. For group meetup planning, extract what/when/where/who and notify Vincent if actionable, including transcript confidence.\n\n### Video Messages\n\nWhen an incoming WhatsApp message is `videoMessage` (with or without a caption):\n\n1. **Fetch conversation context** first, as with any media message: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n2. **Download the video** using Evolution API (same pattern as images/audio — save response to file, then decode):\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_video_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"CONTACT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_video_req.json -o /tmp/wa_video_response.json\n python3 - <<'PY'\n import json, base64\n d = json.load(open('/tmp/wa_video_response.json'))\n b = d.get('base64') or d.get('data', {}).get('base64', '')\n if ',' in b[:100]: b = b.split(',', 1)[1]\n open('/tmp/wa_video.mp4', 'wb').write(base64.b64decode(b))\n print(f\"Video saved: {len(base64.b64decode(b))} bytes\")\n PY\n ```\n3. **Extract representative frames** with ffmpeg (do NOT try to analyze video directly — vision tools work on images):\n ```bash\n # Frame at start\n ffmpeg -y -ss 0 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame0.jpg\n # Frame at ~1/3 of the way (good for short videos)\n ffmpeg -y -ss 3 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame3s.jpg\n # If video is longer (>15s), also extract at midpoint\n # ffprobe -v quiet -print_format json -show_format /tmp/wa_video.mp4 # get duration\n # ffmpeg -y -ss $HALF_DURATION -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame_mid.jpg\n ```\n4. **Analyze the frames** with `vision_analyze`. Ask specific questions about content: what objects/people/text are visible, what type of video this appears to be (screen recording, personal clip, forwarded meme, etc.), and any text or identifying details.\n5. **Use the caption** from `videoMessage.caption` if present (can be empty/null). Combine frame analysis + caption + conversation context to classify the message intent.\n6. **Transcription of audio in video** is generally not feasible with available tools. If the video likely contains important spoken content (e.g. voice memo style), note this limitation and flag it to Vincent.\n\n**Key points:**\n- Always use `-o` flag and file-based decode; videos can be 1-10MB and base64 responses will be even larger\n- `--max-time 25` on curl is essential for video downloads\n- Short casual clips (showing a key fob, a parking spot, quick view of surroundings) are usually casual/social — classify accordingly and don't over-escalate\n- If `vision_analyze` fails (400 on models that don't support images, or 429 rate limit), fall back to conversation context inference alone\n\n### Poll Messages\nWhen `pollUpdateMessage` appears in chat history (someone voted), the message contains a `pollCreationMessageKey` with the original poll's message `id`. To get the poll question and options, query:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"id\": \"POLL_MESSAGE_ID\", \"remoteJid\": \"GROUP_JID\"}}, \"limit\": 1}'\n```\nThe response will have `pollCreationMessageV3` with `name` (the poll question) and `options` array. Always look up the poll when votes or poll references appear, since the poll text contains the event details.\n\n### `associatedChildMessage` (Motion Photos / Companion Media)\nWhatsApp motion photos and companion media arrive as `associatedChildMessage` type. These are short videos or supplementary images attached to a parent message via `messageAssociation.parentMessageKey`. The structure is:\n\n```\nmessage: {\n messageContextInfo: {\n messageAssociation: {\n associationType: 12,\n parentMessageKey: { id, fromMe, remoteJid }\n }\n },\n associatedChildMessage: {\n message: {\n videoMessage: { ... } // or imageMessage: { ... }\n }\n }\n}\n```\n\nKey points:\n- The `parentMessageKey.id` links back to the main message (the image/video they accompany)\n- When `parentMessageKey.fromMe: true`, the companion media is replying to a message Vincent sent\n- Captions may appear on the child's `videoMessage.caption` or `imageMessage.caption` inside the nested message, NOT at the top level\n- These often contain motion-photo short videos (3 seconds) or album companion images\n- Treat them as media attachments to the parent; don't process them as standalone messages requiring independent intent analysis\n\n### Sequential Image Sets (Non-Album)\nMultiple images sent rapidly by the same sender at similar timestamps may NOT be wrapped in an `albumMessage`. They appear as separate `imageMessage` entries. Always check for adjacent images from the same sender and download+OCR all of them together, as they often form a coherent multi-page set (e.g., route info, comparison tables, maps). Missing one can lose important context.\n\n### Key pitfalls\n- The Evolution API `getBase64FromMediaMessage` endpoint returns the full base64 payload (can be 100KB+ for images). Use `-o` flag to save to file, then parse with Python. Do NOT try to pipe it directly\n- `curl` with `--max-time 25` is essential; without it, large media responses can hang\n- The `albumMessage` type itself has no `fromMe` field; check each individual `imageMessage` instead\n- Image thumbnails in the webhook payload are truncated; always download the full image for analysis\n- **WhatsApp media URLs expire**: CDN download links (`mmg.whatsapp.net`) expire after roughly 24-48 hours. When `getBase64FromMediaMessage` returns a 400/404 error like \"Failed to fetch stream from https://mmg.whatsapp.net/...\", the media is no longer retrievable via the API. This means you CANNOT download adjacent/older voice notes, images, or documents for context if they're more than a day or two old. Fall back to conversation text context only. Fresh/incoming messages are almost always still downloadable.\n- **Voice note transcription reliability**: Short voice notes (<15s), especially in Singlish/Chinese-English mix common in Singapore contacts, often produce unreliable transcriptions via SpeechRecognition+Google. Always flag low-confidence transcriptions to Vincent rather than acting on potentially garbled text.\n- **Vision API failures**: `vision_analyze` can fail with 429 (rate limit) **or 400** (model doesn't support image input, e.g. glm-5.1 via ollama-cloud). Always have a fallback plan. When vision is unavailable, try tesseract OCR, then pixel analysis (PIL/numpy for brightness zones, text region detection via row-level dark-pixel density)\n- **OCR can fail on decorative/complex images**: Obituaries, memorial notices, religious flyers with ornate fonts, or dark-background images often defeat tesseract even with aggressive preprocessing (upscaling, thresholding, inverting, PSM mode variations). When OCR returns gibberish or empty, fall back to conversation context inference: use preceding/surrounding messages to determine what the image likely contains, then draft an appropriate response based on that context\n- **OCR on devotional/dark-background images**: These often defeat tesseract on full-image scans. Use a multi-strategy approach: (1) run tesseract on the full image first, (2) if text is sparse, crop into regions (top 15%, middle, bottom 15%), upscale 2-3x, enhance contrast (2-3x), and try both normal and inverted (`ImageOps.invert`) versions with `--psm 6`, (3) combine all partial OCR results. For Exodus/verse images, the title (e.g. \"WHEN MOSES' HANDS WERE\") is often in decorative font that OCR misses, while the verse body text is usually readable.\n- **Prioritize conversation context**: When processing group images, always fetch recent context FIRST (before attempting vision/OCR). If image analysis fails, the conversation history often makes the image's purpose clear enough to draft a meaningful reply\n- **DM conversation context is one-sided (fromMe only)**: The Evolution API `findMessages` endpoint often returns only Vincent's sent messages (`fromMe: true`) for DM chats, not the contact's incoming messages. This means `wa_recent_messages.py` can show only one side of the conversation. When processing a DM webhook, do NOT rely solely on fetched history for the contact's side. The webhook payload already contains the incoming message text. Use what the webhook gives you plus Vincent's side from the API. If you need the contact's recent messages, check if `fromMe: false` records exist; if none are returned, the API simply doesn't have them and you must work with what you have (webhook message + Vincent's replies for context).\n- **`findMessages` ignores `key.fromMe` filter**: Passing `\"key.fromMe\": false` in the `where` clause still returns all messages (including `fromMe: true`). To find messages from a specific sender, fetch results and filter client-side by checking `fromMe` in Python.\n- **`findMessages` caps at ~50 records per request**: Even with `limit: 200` or `limit: 500`, the API returns at most 50 records. When one party sends many consecutive messages, the other party's messages may not appear in the first 50 results. Use pagination or fetch multiple pages if you need older messages.\n- **Video frame extraction**: Use `ffmpeg -y -ss N -i video.mp4 -frames:v 1 -update 1 output.jpg` (the `-update 1` flag is needed; without it ffmpeg treats the output as an image sequence and fails). Extract frames at multiple timestamps (start, ~1/3, midpoint for longer videos) since a single frame may miss the key content.\n- **ffmpeg \"Invalid color space\" failure on WhatsApp videos**: Some WhatsApp H.264 videos have `color_space=reserved` and `color_primaries=reserved` metadata that ffmpeg cannot decode at all. Every codec/format/filter combination fails with \"Invalid color space\" / return code 234. This is NOT fixable with `-vf setparams`, `-color_trc`, `-colorspace`, `-color_primaries`, or `-vf format=` flags because the error occurs at the decoder level before any filter runs. **Do not waste time trying ffmpeg workarounds.** When this happens, fall back to: (1) conversation context inference (who sent it, what thread it's in, adjacent messages), (2) the jpegThumbnail embedded in the videoMessage (small but gives a rough idea; base64-encoded JPEG, typically 600-900 bytes, use `vision_analyze` on it if available), or (3) if all analysis fails, just classify based on surrounding text context.\n- **`wa_recent_messages.py` JSON output breaks `json.loads()`**: The script's stdout contains WhatsApp message content with control characters (emoji, CJK, deviceListMetadata binary blobs) that cause `json.loads()` to fail with `Invalid control character` or `Expecting ',' delimiter` errors. Even `strict=False` does not reliably fix it. **Reliable approach**: use `curl -o /tmp/wa_msgs.json` to save the Evolution API response directly to a file (the file write handles binary safely), then parse with `json.load(open('/tmp/wa_msgs.json'))` in Python. Example:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 15 -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"CONTACT_JID\"}}, \"limit\": 20}' -o /tmp/wa_msgs.json\n ```\n Then in Python: `data = json.load(open('/tmp/wa_msgs.json')); records = data['messages']['records']`\n\n## API Reference\n\nFor full API endpoints (media, contacts, location, buttons) — see [references/api.md](references/api.md).\n\n[Skill directory: /opt/data/skills/personal/whatsapp-evolution]\nResolve any relative paths in this skill (e.g. `scripts/foo.js`, `templates/config.yaml`) against that directory, then run them with the terminal tool using the absolute path.\n\n[This skill has supporting files:]\n- references/api.md -> /opt/data/skills/personal/whatsapp-evolution/references/api.md\n- scripts/process_queue.py -> /opt/data/skills/personal/whatsapp-evolution/scripts/process_queue.py\n- scripts/send_text.sh -> /opt/data/skills/personal/whatsapp-evolution/scripts/send_text.sh\n\nLoad any of these with skill_view(name=\"personal/whatsapp-evolution\", file_path=\"\"), or run scripts directly by absolute path (e.g. `node /opt/data/skills/personal/whatsapp-evolution/scripts/foo.js`).\n\nThe user has provided the following instruction alongside the skill invocation: Incoming WhatsApp webhook received.\n\nEvent: messages.upsert\nInstance: n8n-WA\nSender: Jocelyn\nSender JID: 120363418226875471@g.us\nFrom me: False\nMessage type: reactionMessage\nMessage: {\n \"event\": \"messages.upsert\",\n \"instance\": \"n8n-WA\",\n \"data\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"3A34DD2DC662EC2E0785\",\n \"participant\": \"192534810767573@lid\",\n \"participantAlt\": \"6594245364@s.whatsapp.net\",\n \"addressingMode\": \"lid\"\n },\n \"pushName\": \"Jocelyn\",\n \"status\": \"DELIVERY_ACK\",\n \"message\": {\n \"reactionMessage\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": true,\n \"id\": \"AC37643983DC88906B4D0F1A09F74082\",\n \"participant\": \"54644919296042@lid\"\n },\n \"text\": \"\\u2764\\ufe0f\",\n \"senderTimestampMs\": {\n \"low\": -1597606782,\n \"high\": 414,\n \"unsigned\": false\n }\n }\n },\n \"messageType\": \"reactionMessage\",\n \"messageTimestamp\": 1780813821,\n \"instanceId\": \"b2bfaa30-d5f1-4b22-b979-791892619414\",\n \"source\": \"ios\"\n },\n \"destination\": \"https://jimmy.huper.asia/webhooks/whatsapp-incoming\",\n \"date_time\": \"2026-06-07T03:30:22.042Z\",\n \"sender\": \"6590054438@s.whatsapp.net\",\n \"server_url\": \"https://evolution.huper.asia\",\n \"apikey\": \"5E4C532223A0-460B-A749-16C584F56E74\"\n}\n\nPrimary goal: REDUCE NOISE. Stay silent in the main Telegram chat unless the message truly needs Vincent's immediate attention. Approval-worthy reply drafts should go to the jimmy-wa-drafts sidecar, NOT to the main Telegram chat.\n\nFollow the whatsapp-evolution skill and /opt/data/agent-state/memory/whatsapp-vip-contacts.json, with these stricter notification rules:\n\nABSOLUTE SILENCE RULES:\n- If fromMe is true: ignore. Do NOT call send_message. Final response local log only.\n- Ignore status/update/read-receipt events or payloads with no real message content.\n- Do NOT send main Telegram summaries for ordinary DMs, routine logistics, casual family messages, generic group events, group chatter, forwards, memes, devotional/religious shares, links, standalone codes, normal confirmations, or suggested replies.\n- Do NOT ask \"Want me to add this to Calendar?\" for WhatsApp group events unless Vincent is explicitly invited, named, responsible, or the event is time-sensitive and clearly relevant to him today.\n- Do NOT surface family messages just because they are family. Only surface if urgent, safety-related, school/pickup critical, money/legal/medical, or explicitly requires Vincent now.\n\nSIDEcar RULES:\n- If a reply may be useful but it is not urgent, POST it to https://jimmy-draft.huper.asia/drafts using the whatsapp-evolution Draft Approvals flow. Do NOT duplicate that draft/summary in the main Telegram chat.\n- Prayer and birthday replies should go to the sidecar when appropriate, not the main Telegram chat, unless the underlying situation is urgent/sensitive and Vincent needs immediate awareness.\n- Before drafting any reply or approval suggestion, fetch recent context: python3 /opt/data/scripts/wa_recent_messages.py \"120363418226875471@g.us\" 30. For active/ambiguous groups use 50.\n\nMAIN TELEGRAM ALERTS ARE ALLOWED ONLY FOR:\n1. VIP contacts marked always_call or always_notify when the message is important/actionable, or any VIP call event.\n2. Calls from family/VIP that appear missed/ringing and likely need immediate attention. Avoid repeated alerts for accepted/terminated duplicate call state updates.\n3. Emergencies, safety issues, medical issues, school pickup/child logistics requiring action now, legal/money/account/security matters, or time-sensitive commitments in the next few hours.\n4. Messages explicitly asking Vincent to call back, decide now, approve payment, provide OTP/code for an active task, or respond urgently.\n5. Pipeline errors or webhook processing failures.\n\nCALL RULES:\n- VIP contacts (e.g. Dn Jeff, Ng Whye Keong): call Vincent immediately outside quiet hours when actionably important or on ringing/missed call. During 1am-10am SGT, Telegram only unless life-and-death emergency.\n- Do not call for casual VIP confirmations or non-actionable courtesy messages.\n\nTONE/RULES FOR DRAFTS:\n- Proposed replies must be Christian and grace-based where appropriate: warm, humble, encouraging, non-judgmental, peace-making, not preachy.\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- NEVER auto-reply directly on WhatsApp unless Vincent explicitly approves or future config marks that exact category pre-approved.\n\nDelivery rule:\n- For allowed main alerts only: call send_message(target=\"telegram\", message=...). Keep it concise.\n- For non-urgent reply suggestions: use the sidecar only.\n- For everything else: DO NOT call send_message. Keep final response as local-only log.\n", "timestamp": 1780813827.1937604, "anchor": true}, {"id": 104132, "role": "assistant", "content": "**Local log only. No action required.**\n\n**Reason**: This is a `reactionMessage` (❤️ emoji) from Jocelyn reacting to a message Vincent previously sent in group `120363418226875471@g.us`. Reaction messages are social gestures with no substantive content requiring response or notification. Falls under \"status/update/read-receipt events or payloads with no real message content\" — silently ignored per the noise reduction rules.", "timestamp": 1780813827.2797387}], "bookend_end": [], "messages_before": 0, "messages_after": 2}, {"session_id": "20260607_063007_b039f88f", "when": "June 07, 2026 at 06:30 AM", "source": "webhook", "model": "glm-5.1", "title": "WhatsApp Message Filtering and Processing", "matched_role": "user", "match_message_id": 104128, "snippet": "...true`, they are replying to a >>>message<<< Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an >>>incoming<<< WhatsApp >>>message<<< is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification...", "bookend_start": [], "messages": [{"id": 104128, "role": "user", "content": "[IMPORTANT: The user has invoked the \"whatsapp-evolution\" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]\n\n---\nname: whatsapp-evolution\ndescription: WhatsApp messaging via Evolution API — send, receive, analyze, and auto-reply. Use when processing an incoming WhatsApp message delivered via the webhook, replying on Vincent's behalf, liaising with contacts, sending messages, or handling WhatsApp-related tasks. Covers intent analysis, auto-reply rules, escalation (VIP → phone call), conversation context fetching, and draft approvals via the jimmy-wa-drafts sidecar.\n---\n\n# WhatsApp via Evolution API\n\n## Credentials\nStored in `/opt/data/secrets/.whatsapp-evolution-credentials`:\n- `EVOLUTION_API_URL`, `EVOLUTION_API_KEY`, `EVOLUTION_INSTANCE`\n\nLoad with: `export $(cat .whatsapp-evolution-credentials | xargs)`\n\n## Sending Messages (direct, no approval)\n\n```bash\nexport $(cat .whatsapp-evolution-credentials | xargs)\ncurl -s -X POST \"${EVOLUTION_API_URL}/message/sendText/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"number\": \"6591234567\", \"text\": \"Hello!\"}'\n```\n\n- Phone format: country code, no `+` (e.g. `6590054438`)\n- Groups: use JID `{groupid}@g.us`\n- For media/location/contacts — see [references/api.md](references/api.md)\n\nUse direct send only when Vincent has explicitly commanded/approved the exact send, or for pre-approved auto-replies/follow-ups. For any reply that needs Vincent's review, use the sidecar (next section).\n\n### Direct replies when Vincent names an existing WhatsApp thread\n\nIf Vincent says to “reply to X” or “confirm with X” and has clearly approved the action, but the contact JID/number is not in the current prompt:\n\n1. Search recent session history first (`session_search`) for the contact name, clinic/business name, or appointment details. Prior webhook summaries often contain the exact `remoteJid`, alternate phone JID, sender name, and context.\n2. If session history identifies the WhatsApp phone JID, send directly only if Vincent’s instruction is explicit enough to act. Keep the text concise and natural.\n3. If only a LID JID is found, look for `remoteJidAlt`/alternate phone JID in the session summary or fetch contacts/chats before sending.\n4. Do not invent phone numbers from web search when replying to an existing WhatsApp conversation. Use web search only as supporting context, not as the destination, unless Vincent explicitly asks to contact a public number.\n\n\n### Sending trip/camp details from retrieved emails\n\nWhen Vincent explicitly asks to send extracted school trip/camp details to a child or family member, send directly. Keep it practical and grounded:\n- Include dates, reporting/pick-up times, flights, venue/itinerary highlights, deadlines, forms, and safety/medication notes when found.\n- If an official packing list is missing, say that clearly. Do not invent one. You may add a short “based on itinerary/weather, prepare…” list, clearly labeled as practical guidance rather than official school instructions.\n- Avoid long explanations about the search process in WhatsApp. Send the useful details.\n\n### Sending a local image/file as media\n\nFor user-approved media sends, base64-encode the local file and call `sendMedia`:\n\n```bash\nexport $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\npython3 - <<'PY'\nimport base64, json\npath = '/tmp/file.png'\npayload = {\n 'number': '6591234567',\n 'mediatype': 'image',\n 'mimetype': 'image/png',\n 'caption': 'Caption here',\n 'media': base64.b64encode(open(path, 'rb').read()).decode(),\n 'fileName': 'file.png',\n}\nopen('/tmp/wa_media_payload.json', 'w').write(json.dumps(payload))\nPY\ncurl -s --max-time 60 -X POST \"${EVOLUTION_API_URL}/message/sendMedia/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_media_payload.json\n```\n\nWhen sending QR tickets or passes, verify the crop/image contains the full QR code and the intended attendee name before sending; the QR screenshot is effectively the redeemable ticket.\n\n## Fetching Conversation Context\n\nBefore drafting or proposing any reply, always fetch recent chat history for context:\n\n```bash\npython3 /opt/data/scripts/wa_recent_messages.py \"6591234567@s.whatsapp.net\" 30\n```\n\nUse `limit=30` by default. For active group chats or ambiguous situations, fetch up to 50 messages. Read enough context to understand who is being discussed, current tone, previous commitments, and whether Vincent has already replied. The script returns newest messages first (descending order).\n\n## Draft Approvals — `jimmy-wa-drafts` sidecar\n\nWhen you need Vincent to approve a WhatsApp reply, POST the draft to the sidecar. It owns the entire approval flow: Telegram message with 4 inline buttons, Telegram WebApp for Edit/Context, sending to WhatsApp via Evolution API on Approve.\n\n### Posting a draft\n\nAlways include an `extras` object with the original WA context. The sidecar passes `extras` through verbatim if Vincent later clicks \"Context\" — without it, your redraft will be flying blind.\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\n \"contact\": \"120363351787895353@g.us\",\n \"contact_name\": \"Cycling group\",\n \"summary\": \"Karen sent a blessing over the bike ride; she's not joining\",\n \"draft_text\": \"Thanks Karen, will miss you both — praying for a good time too 🙏\",\n \"extras\": {\n \"group_jid\": \"120363351787895353@g.us\",\n \"sender_name\": \"Karen Tan\",\n \"sender_jid\": \"6597630553@s.whatsapp.net\",\n \"category\": \"prayer/blessing\",\n \"original_message\": \"Happy cycling, to those who can make it… Dn Jeff and I had made plans for this PM so we won't be able to join y'all. Praying the Lord's favor, protection and good-weather blessings be upon y'all and your route!\",\n \"extracted\": {\n \"what\": \"Cycling from East Coast Park to Changi Village, then dinner\",\n \"when\": \"Today, 4:00 PM\",\n \"where\": \"East Coast Park to Changi Village\",\n \"who\": \"Group members who voted yes for cycling/dinner or dinner\"\n },\n \"confidence\": \"high\",\n \"why_appropriate\": \"Multiple group messages are blessings/wishes; Vincent has not replied; tone matches grace-based norm.\",\n \"wa_message_id\": \"ABC123XYZ\"\n }\n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\nResponse: `{\"draft_id\":\"abc123def\",\"telegram_message_id\":42}`\n\nWhat to put in `extras` (use whatever makes sense for the message type):\n\n- **Always**: `original_message` (verbatim WA text), `sender_name`, `category` (e.g. `prayer`, `birthday`, `event`, `dm-reply`, `enquiry`)\n- **Group messages**: `group_jid`, `sender_jid`\n- **DMs**: `sender_jid`\n- **Strongly recommended (drives the rich Telegram render)**: `extracted` (dict of who/what/when/where), `confidence` (`high`/`medium`/`low`), `why_appropriate` (one-line reasoning shown to Vincent so he can decide)\n- **Optional**: `wa_message_id`, `tone_hint` (e.g. `warm-grace-based`), `sender_phone` (auto-derived from `sender_jid` if omitted)\n\nVincent receives a Telegram message in the `@jimmy_WA_bot` chat with 4 inline buttons. The sidecar handles every interaction; **you do not track replies, callbacks, or message IDs**. Just POST and move on.\n\n| Button | Behavior |\n|--------|----------|\n| **Approve** | Sidecar sends draft to WhatsApp |\n| **Edit** | Opens a Telegram WebApp with draft pre-filled; Vincent edits + sends; sidecar pushes to WhatsApp |\n| **Context** | Opens a Telegram WebApp with empty textarea + the original WA message visible; Vincent provides context. (When `HERMES_REDRAFT_WEBHOOK` is wired in the sidecar's env, this will POST back to Jimmy for a redraft. Until then, the context is just stored on the draft.) |\n| **Pass** | Sidecar deletes the draft |\n\n### JSON escaping when `draft_text` contains quotes\n\nSingle quotes (`It's`, `God's`) break shell interpolation. Write the JSON to a temp file first:\n\n```bash\ncat > /tmp/wa_draft.json <<'JSON'\n{\"contact\":\"6591234567@s.whatsapp.net\",\"contact_name\":\"John\",\"summary\":\"…\",\"draft_text\":\"It's nice to hear from you!\"}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_draft.json\n```\n\n### Networking\n\n`https://jimmy-draft.huper.asia` is the public HTTPS endpoint of the sidecar (TLS-terminated by Traefik on ubuntu-server). HTTPS is required because Hermes blocks plain-HTTP commands via tirith. The internal Docker DNS hostname http://jimmy-wa-v64mqe:8080 also works for non-shell calls but the agent is forbidden from using it.\n\n### Sidecar status & logs\n\n- Source: (auto-deploys on push)\n- Public URL: (`/health` for ping)\n- Container: `jimmy-wa-v64mqe.1.` on ubuntu-server\n- Bot: `@jimmy_WA_bot`\n\n## Handling a Redraft Trigger (`wa-redraft` webhook)\n\nWhen Vincent clicks the **Context** button on a draft, the sidecar:\n1. Edits the original Telegram message to \"🔄 Adding context for X… agent is redrafting.\" (removes the old buttons so Vincent can't approve the now-stale text)\n2. POSTs to your `wa-redraft` webhook (`http://localhost:8644/webhooks/wa-redraft`) with the original draft fields plus Vincent's `context` instructions and the `extras` you stored when you first created the draft.\n\nWhen you receive a `wa-redraft` activation, the prompt template will lay out the fields. Your job:\n\n1. Read Vincent's `context` carefully — that's the steering signal. The `previous_draft` is just history, not necessarily a starting point.\n2. Re-fetch recent WA history if the original context isn't enough (`python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`). Use `extras.group_jid` for groups or `contact` for DMs.\n3. Write a NEW draft that addresses Vincent's instructions while staying faithful to the original WA context (in `extras.original_message`, `extras.extracted`).\n4. POST the new draft back to the sidecar **with the same `draft_id`** so the existing draft is updated, not duplicated:\n\n```bash\ncat > /tmp/wa_redraft.json <\",\n \"contact\": \"\",\n \"contact_name\": \"\",\n \"summary\": \"\",\n \"draft_text\": \"\",\n \"extras\": \n}\nJSON\ncurl -s -X POST https://jimmy-draft.huper.asia/drafts \\\n -H 'Content-Type: application/json' \\\n -d @/tmp/wa_redraft.json\n```\n\nThe sidecar will send a NEW Telegram message with 4 fresh buttons. Vincent can Approve, Edit, Context-again (loop), or Pass.\n\nDo **not** send to WhatsApp directly during a redraft — the approval flow is the sidecar's job.\n\n## Incoming Message Processing\n\n### Delivery Pipeline (current)\n1. Evolution API POSTs each `MESSAGES_UPSERT` webhook directly to Hermes at `http://localhost:8644/webhooks/whatsapp-incoming` (proxied via `jimmy.huper.asia/webhooks/whatsapp-incoming`)\n2. Hermes webhook route (`whatsapp-incoming`, dynamic subscription) validates and spawns an agent\n3. Agent loads `whatsapp-evolution` skill, follows the rules below\n4. Webhook subscription uses `deliver: log` because webhooks do not support `deliver: local`; the agent explicitly calls `send_message(target=\"telegram\", ...)` only for important/actionable/VIP/event/birthday/prayer/error items, OR posts a draft to the sidecar for approval items.\n\n**No more OpenClaw, no Python webhook server, no port 5001, no keepalive cron.** If messages aren't being processed, check: nginx proxy to port 8644, `hermes webhook list` for the subscription, Evolution API webhook URL setting.\n\n## Contact Rules & Escalation\n\nConfig: `memory/whatsapp-vip-contacts.json`\n\n### Priority Levels\n\n| Priority | Action | Example |\n|----------|--------|---------|\n| **VIP (call-always)** | **CALL Vincent immediately via Twilio without asking.** Do NOT ask for approval first. During quiet hours (1am-10am SGT) suppress call and use Telegram notification only, unless life-and-death emergency. Match VIPs by both phone JID (`659...@s.whatsapp.net`) and LID JID (`...@lid`) when present, because recent chats may arrive with `key.remoteJid` as an LID and the phone number only in `key.remoteJidAlt`. | Dn Jeff (+6597630553) |\n| **Family** | Always notify Vincent on Telegram | ps112 group, family members |\n| **Auto-reply** | Reply based on context file, notify Vincent | Repair shops |\n| **Normal** | Analyze intent, reply if appropriate, notify if important | Everyone else |\n| **Broadcast/Spam** | Ignore silently | Forwarded job posts, chain messages |\n\n### Quiet Hours\n- 1am-10am SGT (17:00-02:00 UTC): avoid phone calls, including for Dn Jeff / VIP call-always contacts\n- **Exception:** life-and-death emergencies always call regardless of quiet hours\n- During quiet hours, use Telegram notification only and mention that a call was suppressed due to quiet hours\n\n## Making a Call via Twilio\n\nCredentials: `/opt/data/secrets/.twilio-credentials-new` (`TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER`). From: `+16614809287`, Vincent: `+6590054438`.\n\nPreferred script: `/opt/data/scripts/call_vincent.py`, now using the Twilio Python SDK baked into the container.\n\n```bash\npython3 /opt/data/scripts/call_vincent.py \"Hello Vincent, this is Jimmy. Urgent summary here.\"\n```\n\n## Action Rules\n\n### Default: DO NOT auto-reply directly, and do not spam main Telegram\nCollect intent and stay local unless something truly needs Vincent's immediate attention.\n\n**When you need Vincent to approve a reply, POST a draft to the sidecar (see \"Draft Approvals\" above).** The sidecar handles the Telegram notification with buttons, the WebApp for editing, and sending to WhatsApp on approve. Do **not** duplicate that draft or summary in the main Telegram chat.\n\nMain Telegram WhatsApp summaries are muted by default. Only use `send_message(target=\"telegram\", ...)` for urgent/important/actionable items that need Vincent now: VIP actionable messages/calls, missed/ringing family calls, emergencies, safety/medical/school-pickup issues, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors. Routine logistics, casual family pings, generic group events, links, standalone codes, normal confirmations, and suggested replies should be local-only or sidecar-only.\n\nDo NOT send WhatsApp replies directly unless Vincent explicitly approves them or the category is pre-approved.\n\n### Tone (when proposing or sending replies)\n- Casual, direct, no sign-offs (no \"Thanks & Regards\", no \"- Vincent\")\n- Ask questions instead of giving instructions\n- Keep it natural — like texting a friend\n- Christian and grace-based: warm, humble, encouraging, non-judgmental, peace-making\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- For prayer/birthday contexts, use gentle faith-filled language; avoid sounding preachy, performative, or overly formal\n\n### What to Do With Each Message\n\n**DMs (personal chats):**\n- Analyze intent\n- Notify Vincent on Telegram (via `send_message(target=\"telegram\", ...)`) only if important/actionable/time-sensitive, requires a decision, comes from a VIP contact\n- If you have a useful suggested reply, POST it as a draft to the sidecar instead of a Telegram notification — Vincent gets the buttons inline\n- Casual/social/non-actionable DMs → local log only, no Telegram notification, no draft\n- Do NOT reply unless Vincent says to\n- VIP contacts → **call Vincent immediately without asking**. Do NOT post a draft. Just call. During quiet hours (1am-10am SGT), suppress the call and send Telegram notification only, unless life-and-death emergency.\n\n**Group chats — events/meetups:**\n- If someone posts about an event, gathering, meetup, dinner, etc:\n 1. Extract: what, when, where, who\n 2. Notify Vincent on Telegram with details (no draft needed unless you also want to suggest an RSVP reply)\n 3. Ask: \"Want me to add this to Google Calendar and set a reminder?\"\n\n**Group chats — prayer requests / praying for others:**\n- If someone asks for prayer, shares a burden, illness, grief, struggle, thanksgiving, or asks the group to pray:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Identify who/what to pray for and any sensitive details\n 3. Draft a short, grace-based Christian reply/prayer\n 4. POST it to the sidecar with `summary` describing the request and `draft_text` being the prayer\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — birthday greetings:**\n- If you see birthday wishes for someone in a group:\n 1. Fetch recent group context first: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n 2. Note who the birthday person is and relationship/context if inferable\n 3. Draft a warm, grace-based Christian birthday greeting\n 4. POST it to the sidecar\n 5. Do NOT post to WhatsApp unless Vincent approves\n\n**Group chats — general:**\n- Casual banter, memes, forwards, broadcast spam → ignore silently\n- Religious content shares → ignore unless specifically addressed to Vincent\n\n### When to Notify Vincent (Telegram) vs Post a Draft\n\n**Use `send_message(target=\"telegram\", ...)` sparingly** for plain notifications only when the message needs Vincent's immediate attention: VIP actionable DMs/calls, emergencies, school pickup/child logistics requiring action now, money/legal/account/security matters, explicit call-back/decision-now requests, or pipeline errors.\n\n**Use the sidecar (POST `/drafts`)** whenever you have a suggested reply for Vincent to approve: prayer responses, birthday greetings, DM replies, group RSVPs. Do not also send the same summary to the main Telegram chat.\n\nDo **not** notify or draft for casual/social/non-actionable DMs, routine logistics, read receipts/status updates, religious shares not addressed to Vincent, memes, forwarded/broadcast/spam, generic group events, standalone codes, normal confirmations, or general group chatter. Family messages are not automatically main-chat alerts unless urgent, safety-related, school/pickup critical, or explicitly requiring Vincent now.\n\n## Intent Categories\n\nWhen analyzing incoming messages, classify as:\n\n1. **Urgent** — time-sensitive, emergencies → call Vincent outside quiet hours; during quiet hours use Telegram only unless life-and-death\n2. **Important** — decisions, money, appointments, anything consequential → call Vincent outside quiet hours; during quiet hours notify on Telegram only unless life-threatening\n3. **Request** — asking Vincent to do something → **notify Vincent** (Telegram) or post a draft if you have a safe reply\n4. **Enquiry** — asking for info, pricing, availability → auto-reply if context available, otherwise post a draft\n5. **Confirmation** — confirming plans, bookings → acknowledge and notify Vincent\n6. **Casual/Social** — greetings, memes, forwards → usually ignore or react\n7. **Broadcast/Spam** — forwarded job posts, chain messages → ignore\n\n## Finding Individual Contacts and JIDs\n\nWhen Vincent asks to monitor or prioritize a specific WhatsApp contact and only gives a name:\n\n1. Query contacts and chats, then search both display names and serialized message metadata:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n python3 - <<'PY'\n import os, requests, json\n base=os.environ['EVOLUTION_API_URL']; inst=os.environ['EVOLUTION_INSTANCE']; key=os.environ['EVOLUTION_API_KEY']\n headers={'apikey':key,'Content-Type':'application/json'}\n terms=['whye','keong'] # replace with lowercase search terms\n for ep in ['findContacts','findChats']:\n r=requests.post(f'{base}/chat/{ep}/{inst}', headers=headers, json={}, timeout=30)\n print('endpoint', ep, 'status', r.status_code)\n for obj in (r.json() if r.ok else []):\n if any(t in json.dumps(obj, ensure_ascii=False).lower() for t in terms):\n print(json.dumps(obj, ensure_ascii=False)[:1000])\n PY\n ```\n2. If `findChats` shows `remoteJid` ending in `@lid`, inspect `lastMessage.key.remoteJidAlt` for the phone JID. Store both in `memory/whatsapp-vip-contacts.json` when configuring VIP/call rules.\n3. Verify the number exists with `/chat/whatsappNumbers/{instance}` before finalizing.\n4. Validate the JSON config after editing: `python3 -m json.tool /opt/data/agent-state/memory/whatsapp-vip-contacts.json >/tmp/wa_vip_validated.json`.\n\n## Finding Group JIDs\n\nThe `group/fetchGroups` endpoint returns 404 on some Evolution API instances (including ours). To find a group JID when you don't have it:\n\n1. **Use `chat/findMessages` with an empty `where` filter** to pull recent messages across all chats, then extract unique `remoteJid` values containing `@g.us`.\n2. **Match by content**: Fetch messages from each group JID using `where.key.remoteJid` and match sender names or message text to identify the target group.\n3. **Known group JIDs** (store in `whatsapp-vip-contacts.json` as you discover them):\n - ps112 family group: `6591887488-1464061170@g.us`\n - Overcomers group: `6591444911-1614821287@g.us`\n - 神的华美 group: `6596756578-1405873840@g.us` (content-sharing, casual/broadcast, local log only)\n - Overcomers/TCians group: `120363230236878544@g.us` (church fellowship, PJ sends weekly attendance reports, routine attendance = local log only)\n\nExample search:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"6591444911-1614821287@g.us\"}}, \"limit\": 10, \"order\": \"desc\"}'\n```\n\n## Processing Image/Media Messages\n\nWhen an incoming WhatsApp message is an image (imageMessage), video, or other media with no text or only a caption:\n\n1. **Fetch conversation context** as usual with `wa_recent_messages.py`\n2. **Download the media** using the Evolution API:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"message\": {\"key\": {\"id\": \"MESSAGE_ID\", \"fromMe\": false, \"remoteJid\": \"GROUP_OR_CHAT_JID\"}}, \"convertToMp4\": false}' \\\n -o /tmp/wa_media_response.json\n ```\n Then decode: `python3 -c \"import json,base64; d=json.load(open('/tmp/wa_media_response.json')); open('/tmp/wa_media.jpg','wb').write(base64.b64decode(d['base64']))\"`\n3. **Analyze the image** using `vision_analyze` tool with `/tmp/wa_media.jpg` path and a context-aware question\n4. **Fallback if vision is unavailable**: Use `tesseract /tmp/wa_media.jpg stdout` for OCR, combined with pixel analysis (PIL/numpy) and conversation context to infer what the image contains\n5. **Album messages** (`albumMessage`): These bundle multiple images. Each image arrives as a separate `imageMessage` with a `messageAssociation.parentMessageKey` linking them to the album. Download each image separately. The `expectedImageCount` field tells how many images to expect\n6. **messageAssociation**: When images have `parentMessageKey` with `fromMe: true`, they are replying to a message Vincent sent. Use this for context\n\n### Audio / Voice Note Messages\nWhen an incoming WhatsApp message is `audioMessage` or a voice note (`ptt: true`), treat it like other media, but transcribe before intent classification when possible:\n\n1. **Fetch conversation context first**, especially in groups:\n ```bash\n python3 /opt/data/scripts/wa_recent_messages.py \"\" 50\n ```\n2. **Download the audio** with Evolution API, saving the response to a file:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_audio_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"GROUP_OR_CHAT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_audio_req.json -o /tmp/wa_audio_response.json\n python3 - <<'PY'\n import json, base64\n d=json.load(open('/tmp/wa_audio_response.json'))\n b=d.get('base64') or d.get('data',{}).get('base64')\n if ',' in b[:100]: b=b.split(',',1)[1]\n open('/tmp/wa_audio.ogg','wb').write(base64.b64decode(b))\n PY\n ```\n3. **Convert OGG/Opus to WAV**:\n ```bash\n ffmpeg -y -i /tmp/wa_audio.ogg -ar 16000 -ac 1 /tmp/wa_audio.wav\n ```\n4. **Transcribe fallback when Whisper is unavailable**: the container may not have `whisper`, `faster-whisper`, `torch`, or `speech_recognition` installed globally. A reliable lightweight fallback is a temp venv with SpeechRecognition and Google web speech:\n ```bash\n python3 -m venv /tmp/srvenv\n /tmp/srvenv/bin/pip install --quiet SpeechRecognition\n /tmp/srvenv/bin/python - <<'PY'\n import speech_recognition as sr\n r=sr.Recognizer()\n with sr.AudioFile('/tmp/wa_audio.wav') as source:\n audio=r.record(source)\n print(r.recognize_google(audio, language='en-SG'))\n PY\n ```\n5. **Check adjacent voice notes from the same sender**. WhatsApp voice notes are often sent in quick sequential chunks. Download and transcribe nearby `audioMessage` records from the same sender before deciding whether to notify Vincent.\n6. Use the transcript plus recent context to classify the message. For group meetup planning, extract what/when/where/who and notify Vincent if actionable, including transcript confidence.\n\n### Video Messages\n\nWhen an incoming WhatsApp message is `videoMessage` (with or without a caption):\n\n1. **Fetch conversation context** first, as with any media message: `python3 /opt/data/scripts/wa_recent_messages.py \"\" 30`\n2. **Download the video** using Evolution API (same pattern as images/audio — save response to file, then decode):\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n cat > /tmp/wa_video_req.json <<'JSON'\n {\"message\":{\"key\":{\"id\":\"MESSAGE_ID\",\"fromMe\":false,\"remoteJid\":\"CONTACT_JID\"}},\"convertToMp4\":false}\n JSON\n curl -s --max-time 25 -X POST \"${EVOLUTION_API_URL}/chat/getBase64FromMediaMessage/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d @/tmp/wa_video_req.json -o /tmp/wa_video_response.json\n python3 - <<'PY'\n import json, base64\n d = json.load(open('/tmp/wa_video_response.json'))\n b = d.get('base64') or d.get('data', {}).get('base64', '')\n if ',' in b[:100]: b = b.split(',', 1)[1]\n open('/tmp/wa_video.mp4', 'wb').write(base64.b64decode(b))\n print(f\"Video saved: {len(base64.b64decode(b))} bytes\")\n PY\n ```\n3. **Extract representative frames** with ffmpeg (do NOT try to analyze video directly — vision tools work on images):\n ```bash\n # Frame at start\n ffmpeg -y -ss 0 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame0.jpg\n # Frame at ~1/3 of the way (good for short videos)\n ffmpeg -y -ss 3 -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame3s.jpg\n # If video is longer (>15s), also extract at midpoint\n # ffprobe -v quiet -print_format json -show_format /tmp/wa_video.mp4 # get duration\n # ffmpeg -y -ss $HALF_DURATION -i /tmp/wa_video.mp4 -frames:v 1 -update 1 /tmp/wa_video_frame_mid.jpg\n ```\n4. **Analyze the frames** with `vision_analyze`. Ask specific questions about content: what objects/people/text are visible, what type of video this appears to be (screen recording, personal clip, forwarded meme, etc.), and any text or identifying details.\n5. **Use the caption** from `videoMessage.caption` if present (can be empty/null). Combine frame analysis + caption + conversation context to classify the message intent.\n6. **Transcription of audio in video** is generally not feasible with available tools. If the video likely contains important spoken content (e.g. voice memo style), note this limitation and flag it to Vincent.\n\n**Key points:**\n- Always use `-o` flag and file-based decode; videos can be 1-10MB and base64 responses will be even larger\n- `--max-time 25` on curl is essential for video downloads\n- Short casual clips (showing a key fob, a parking spot, quick view of surroundings) are usually casual/social — classify accordingly and don't over-escalate\n- If `vision_analyze` fails (400 on models that don't support images, or 429 rate limit), fall back to conversation context inference alone\n\n### Poll Messages\nWhen `pollUpdateMessage` appears in chat history (someone voted), the message contains a `pollCreationMessageKey` with the original poll's message `id`. To get the poll question and options, query:\n```bash\ncurl -s -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"id\": \"POLL_MESSAGE_ID\", \"remoteJid\": \"GROUP_JID\"}}, \"limit\": 1}'\n```\nThe response will have `pollCreationMessageV3` with `name` (the poll question) and `options` array. Always look up the poll when votes or poll references appear, since the poll text contains the event details.\n\n### `associatedChildMessage` (Motion Photos / Companion Media)\nWhatsApp motion photos and companion media arrive as `associatedChildMessage` type. These are short videos or supplementary images attached to a parent message via `messageAssociation.parentMessageKey`. The structure is:\n\n```\nmessage: {\n messageContextInfo: {\n messageAssociation: {\n associationType: 12,\n parentMessageKey: { id, fromMe, remoteJid }\n }\n },\n associatedChildMessage: {\n message: {\n videoMessage: { ... } // or imageMessage: { ... }\n }\n }\n}\n```\n\nKey points:\n- The `parentMessageKey.id` links back to the main message (the image/video they accompany)\n- When `parentMessageKey.fromMe: true`, the companion media is replying to a message Vincent sent\n- Captions may appear on the child's `videoMessage.caption` or `imageMessage.caption` inside the nested message, NOT at the top level\n- These often contain motion-photo short videos (3 seconds) or album companion images\n- Treat them as media attachments to the parent; don't process them as standalone messages requiring independent intent analysis\n\n### Sequential Image Sets (Non-Album)\nMultiple images sent rapidly by the same sender at similar timestamps may NOT be wrapped in an `albumMessage`. They appear as separate `imageMessage` entries. Always check for adjacent images from the same sender and download+OCR all of them together, as they often form a coherent multi-page set (e.g., route info, comparison tables, maps). Missing one can lose important context.\n\n### Key pitfalls\n- The Evolution API `getBase64FromMediaMessage` endpoint returns the full base64 payload (can be 100KB+ for images). Use `-o` flag to save to file, then parse with Python. Do NOT try to pipe it directly\n- `curl` with `--max-time 25` is essential; without it, large media responses can hang\n- The `albumMessage` type itself has no `fromMe` field; check each individual `imageMessage` instead\n- Image thumbnails in the webhook payload are truncated; always download the full image for analysis\n- **WhatsApp media URLs expire**: CDN download links (`mmg.whatsapp.net`) expire after roughly 24-48 hours. When `getBase64FromMediaMessage` returns a 400/404 error like \"Failed to fetch stream from https://mmg.whatsapp.net/...\", the media is no longer retrievable via the API. This means you CANNOT download adjacent/older voice notes, images, or documents for context if they're more than a day or two old. Fall back to conversation text context only. Fresh/incoming messages are almost always still downloadable.\n- **Voice note transcription reliability**: Short voice notes (<15s), especially in Singlish/Chinese-English mix common in Singapore contacts, often produce unreliable transcriptions via SpeechRecognition+Google. Always flag low-confidence transcriptions to Vincent rather than acting on potentially garbled text.\n- **Vision API failures**: `vision_analyze` can fail with 429 (rate limit) **or 400** (model doesn't support image input, e.g. glm-5.1 via ollama-cloud). Always have a fallback plan. When vision is unavailable, try tesseract OCR, then pixel analysis (PIL/numpy for brightness zones, text region detection via row-level dark-pixel density)\n- **OCR can fail on decorative/complex images**: Obituaries, memorial notices, religious flyers with ornate fonts, or dark-background images often defeat tesseract even with aggressive preprocessing (upscaling, thresholding, inverting, PSM mode variations). When OCR returns gibberish or empty, fall back to conversation context inference: use preceding/surrounding messages to determine what the image likely contains, then draft an appropriate response based on that context\n- **OCR on devotional/dark-background images**: These often defeat tesseract on full-image scans. Use a multi-strategy approach: (1) run tesseract on the full image first, (2) if text is sparse, crop into regions (top 15%, middle, bottom 15%), upscale 2-3x, enhance contrast (2-3x), and try both normal and inverted (`ImageOps.invert`) versions with `--psm 6`, (3) combine all partial OCR results. For Exodus/verse images, the title (e.g. \"WHEN MOSES' HANDS WERE\") is often in decorative font that OCR misses, while the verse body text is usually readable.\n- **Prioritize conversation context**: When processing group images, always fetch recent context FIRST (before attempting vision/OCR). If image analysis fails, the conversation history often makes the image's purpose clear enough to draft a meaningful reply\n- **DM conversation context is one-sided (fromMe only)**: The Evolution API `findMessages` endpoint often returns only Vincent's sent messages (`fromMe: true`) for DM chats, not the contact's incoming messages. This means `wa_recent_messages.py` can show only one side of the conversation. When processing a DM webhook, do NOT rely solely on fetched history for the contact's side. The webhook payload already contains the incoming message text. Use what the webhook gives you plus Vincent's side from the API. If you need the contact's recent messages, check if `fromMe: false` records exist; if none are returned, the API simply doesn't have them and you must work with what you have (webhook message + Vincent's replies for context).\n- **`findMessages` ignores `key.fromMe` filter**: Passing `\"key.fromMe\": false` in the `where` clause still returns all messages (including `fromMe: true`). To find messages from a specific sender, fetch results and filter client-side by checking `fromMe` in Python.\n- **`findMessages` caps at ~50 records per request**: Even with `limit: 200` or `limit: 500`, the API returns at most 50 records. When one party sends many consecutive messages, the other party's messages may not appear in the first 50 results. Use pagination or fetch multiple pages if you need older messages.\n- **Video frame extraction**: Use `ffmpeg -y -ss N -i video.mp4 -frames:v 1 -update 1 output.jpg` (the `-update 1` flag is needed; without it ffmpeg treats the output as an image sequence and fails). Extract frames at multiple timestamps (start, ~1/3, midpoint for longer videos) since a single frame may miss the key content.\n- **ffmpeg \"Invalid color space\" failure on WhatsApp videos**: Some WhatsApp H.264 videos have `color_space=reserved` and `color_primaries=reserved` metadata that ffmpeg cannot decode at all. Every codec/format/filter combination fails with \"Invalid color space\" / return code 234. This is NOT fixable with `-vf setparams`, `-color_trc`, `-colorspace`, `-color_primaries`, or `-vf format=` flags because the error occurs at the decoder level before any filter runs. **Do not waste time trying ffmpeg workarounds.** When this happens, fall back to: (1) conversation context inference (who sent it, what thread it's in, adjacent messages), (2) the jpegThumbnail embedded in the videoMessage (small but gives a rough idea; base64-encoded JPEG, typically 600-900 bytes, use `vision_analyze` on it if available), or (3) if all analysis fails, just classify based on surrounding text context.\n- **`wa_recent_messages.py` JSON output breaks `json.loads()`**: The script's stdout contains WhatsApp message content with control characters (emoji, CJK, deviceListMetadata binary blobs) that cause `json.loads()` to fail with `Invalid control character` or `Expecting ',' delimiter` errors. Even `strict=False` does not reliably fix it. **Reliable approach**: use `curl -o /tmp/wa_msgs.json` to save the Evolution API response directly to a file (the file write handles binary safely), then parse with `json.load(open('/tmp/wa_msgs.json'))` in Python. Example:\n ```bash\n export $(cat /opt/data/secrets/.whatsapp-evolution-credentials | xargs)\n curl -s --max-time 15 -X POST \"${EVOLUTION_API_URL}/chat/findMessages/${EVOLUTION_INSTANCE}\" \\\n -H \"apikey: ${EVOLUTION_API_KEY}\" -H \"Content-Type: application/json\" \\\n -d '{\"where\": {\"key\": {\"remoteJid\": \"CONTACT_JID\"}}, \"limit\": 20}' -o /tmp/wa_msgs.json\n ```\n Then in Python: `data = json.load(open('/tmp/wa_msgs.json')); records = data['messages']['records']`\n\n## API Reference\n\nFor full API endpoints (media, contacts, location, buttons) — see [references/api.md](references/api.md).\n\n[Skill directory: /opt/data/skills/personal/whatsapp-evolution]\nResolve any relative paths in this skill (e.g. `scripts/foo.js`, `templates/config.yaml`) against that directory, then run them with the terminal tool using the absolute path.\n\n[This skill has supporting files:]\n- references/api.md -> /opt/data/skills/personal/whatsapp-evolution/references/api.md\n- scripts/process_queue.py -> /opt/data/skills/personal/whatsapp-evolution/scripts/process_queue.py\n- scripts/send_text.sh -> /opt/data/skills/personal/whatsapp-evolution/scripts/send_text.sh\n\nLoad any of these with skill_view(name=\"personal/whatsapp-evolution\", file_path=\"\"), or run scripts directly by absolute path (e.g. `node /opt/data/skills/personal/whatsapp-evolution/scripts/foo.js`).\n\nThe user has provided the following instruction alongside the skill invocation: Incoming WhatsApp webhook received.\n\nEvent: messages.upsert\nInstance: n8n-WA\nSender: Jocelyn\nSender JID: 120363418226875471@g.us\nFrom me: False\nMessage type: reactionMessage\nMessage: {\n \"event\": \"messages.upsert\",\n \"instance\": \"n8n-WA\",\n \"data\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"3A29EDCDCAE04D07526B\",\n \"participant\": \"192534810767573@lid\",\n \"participantAlt\": \"6594245364@s.whatsapp.net\",\n \"addressingMode\": \"lid\"\n },\n \"pushName\": \"Jocelyn\",\n \"status\": \"DELIVERY_ACK\",\n \"message\": {\n \"reactionMessage\": {\n \"key\": {\n \"remoteJid\": \"120363418226875471@g.us\",\n \"fromMe\": false,\n \"id\": \"ACAF388AF06A2F0E1C1777A7917ED565\",\n \"participant\": \"207726193995780@lid\"\n },\n \"text\": \"\\u2764\\ufe0f\",\n \"senderTimestampMs\": {\n \"low\": -1597621784,\n \"high\": 414,\n \"unsigned\": false\n }\n }\n },\n \"messageType\": \"reactionMessage\",\n \"messageTimestamp\": 1780813806,\n \"instanceId\": \"b2bfaa30-d5f1-4b22-b979-791892619414\",\n \"source\": \"ios\"\n },\n \"destination\": \"https://jimmy.huper.asia/webhooks/whatsapp-incoming\",\n \"date_time\": \"2026-06-07T03:30:07.154Z\",\n \"sender\": \"6590054438@s.whatsapp.net\",\n \"server_url\": \"https://evolution.huper.asia\",\n \"apikey\": \"5E4C532223A0-460B-A749-16C584F56E74\"\n}\n\nPrimary goal: REDUCE NOISE. Stay silent in the main Telegram chat unless the message truly needs Vincent's immediate attention. Approval-worthy reply drafts should go to the jimmy-wa-drafts sidecar, NOT to the main Telegram chat.\n\nFollow the whatsapp-evolution skill and /opt/data/agent-state/memory/whatsapp-vip-contacts.json, with these stricter notification rules:\n\nABSOLUTE SILENCE RULES:\n- If fromMe is true: ignore. Do NOT call send_message. Final response local log only.\n- Ignore status/update/read-receipt events or payloads with no real message content.\n- Do NOT send main Telegram summaries for ordinary DMs, routine logistics, casual family messages, generic group events, group chatter, forwards, memes, devotional/religious shares, links, standalone codes, normal confirmations, or suggested replies.\n- Do NOT ask \"Want me to add this to Calendar?\" for WhatsApp group events unless Vincent is explicitly invited, named, responsible, or the event is time-sensitive and clearly relevant to him today.\n- Do NOT surface family messages just because they are family. Only surface if urgent, safety-related, school/pickup critical, money/legal/medical, or explicitly requires Vincent now.\n\nSIDEcar RULES:\n- If a reply may be useful but it is not urgent, POST it to https://jimmy-draft.huper.asia/drafts using the whatsapp-evolution Draft Approvals flow. Do NOT duplicate that draft/summary in the main Telegram chat.\n- Prayer and birthday replies should go to the sidecar when appropriate, not the main Telegram chat, unless the underlying situation is urgent/sensitive and Vincent needs immediate awareness.\n- Before drafting any reply or approval suggestion, fetch recent context: python3 /opt/data/scripts/wa_recent_messages.py \"120363418226875471@g.us\" 30. For active/ambiguous groups use 50.\n\nMAIN TELEGRAM ALERTS ARE ALLOWED ONLY FOR:\n1. VIP contacts marked always_call or always_notify when the message is important/actionable, or any VIP call event.\n2. Calls from family/VIP that appear missed/ringing and likely need immediate attention. Avoid repeated alerts for accepted/terminated duplicate call state updates.\n3. Emergencies, safety issues, medical issues, school pickup/child logistics requiring action now, legal/money/account/security matters, or time-sensitive commitments in the next few hours.\n4. Messages explicitly asking Vincent to call back, decide now, approve payment, provide OTP/code for an active task, or respond urgently.\n5. Pipeline errors or webhook processing failures.\n\nCALL RULES:\n- VIP contacts (e.g. Dn Jeff, Ng Whye Keong): call Vincent immediately outside quiet hours when actionably important or on ringing/missed call. During 1am-10am SGT, Telegram only unless life-and-death emergency.\n- Do not call for casual VIP confirmations or non-actionable courtesy messages.\n\nTONE/RULES FOR DRAFTS:\n- Proposed replies must be Christian and grace-based where appropriate: warm, humble, encouraging, non-judgmental, peace-making, not preachy.\n- Never use em dashes in drafted replies or approval messages. Use commas, periods, colons, or parentheses instead.\n- NEVER auto-reply directly on WhatsApp unless Vincent explicitly approves or future config marks that exact category pre-approved.\n\nDelivery rule:\n- For allowed main alerts only: call send_message(target=\"telegram\", message=...). Keep it concise.\n- For non-urgent reply suggestions: use the sidecar only.\n- For everything else: DO NOT call send_message. Keep final response as local-only log.\n", "timestamp": 1780813814.0094123, "anchor": true}, {"id": 104129, "role": "assistant", "content": "**Local log only. No action needed.**\n\nThis is a `reactionMessage` (❤️ emoji reaction) from Jocelyn in group `120363418226875471@g.us`. It's a casual emoji reaction with no text content, not actionable, not from a VIP contact, and not urgent. Ignoring per noise reduction rules.", "timestamp": 1780813814.0912454}], "bookend_end": [], "messages_before": 0, "messages_after": 2}], "count": 10, "sessions_searched": 10}