Track Record · last 365 days
33.7%
5-day hit rate · n=745
-0.61%
20-day alpha vs SPY · n=467
32%
20-day win rate vs benchmark
810
of 1753 calls scored
Top call: SPCE +10.9% over 5d (2026-05-01)
Worst call: SPCE -45.2% over 5d (2026-06-01)
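The headline figures above can be reproduced from per-call returns. The sketch below is only illustrative: the dashboard does not state its definitions, so it assumes a "hit" is a call with a positive forward return and that alpha is the call's return minus SPY's return over the same window. The function names and the sample returns are hypothetical, not the dashboard's data.

```python
from statistics import mean

def hit_rate(returns):
    """Share of calls with a positive forward return (assumed definition of a hit)."""
    return sum(r > 0 for r in returns) / len(returns)

def mean_alpha(call_returns, benchmark_returns):
    """Average excess return of each call over the benchmark for the same window."""
    return mean(c - b for c, b in zip(call_returns, benchmark_returns))

def win_rate_vs_benchmark(call_returns, benchmark_returns):
    """Share of calls that beat the benchmark over the same window."""
    wins = sum(c > b for c, b in zip(call_returns, benchmark_returns))
    return wins / len(call_returns)

# Hypothetical sample: four calls and the matching SPY returns.
calls = [0.04, -0.02, 0.01, -0.05]
spy = [0.01, 0.00, 0.02, -0.01]

print(f"hit rate: {hit_rate(calls):.1%}")
print(f"mean alpha: {mean_alpha(calls, spy):+.2%}")
print(f"win rate vs benchmark: {win_rate_vs_benchmark(calls, spy):.1%}")

# Share of calls scored, using the two counts shown in the tile above.
print(f"scored: {810 / 1753:.1%}")
```

Under these assumptions a negative mean alpha alongside a sub-50% win rate, as reported above, means the calls trailed SPY on average over the 20-day window.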
Executive Brief
Unlearning is the dominant theme today: 10 stories surfaced, running 7× its baseline.
Lead story: VT-WAM: Visual-Tactile World Action Model for Contact-Rich Manipulation — exposure across MDB, MSFT, GOOGL, NVDA.
Also notable: LACUNA: A Testbed for Evaluating Localization Precision for LLM Unlearning — touches MSFT, GOOGL, META, NVDA, MDB, ARM.
Cross-cutting: evidence is showing up around BYDDF, RIVN, TSLA.
5
Large-cap exposure
5
Market movers
arXiv Preprints · 10
AX · 2026-07-02T17:58:36Z
Contact-rich manipulation requires policies to react to local deformation, pressure, slip, and friction, yet these cues are temporally sparse and often invisible in visual observations. Existing visual-tactile policies usually feed tactile observations directly into action prediction, but rarely model tactile deformation dynamics during action generation. In this paper, we introduce VT-WAM, a Visual-Tactile World Action Model that jointly learns future visual prediction, tactile deformation prediction, and action prediction within a unified flow matching framework. In particular, VT-WAM introduces (1) Asymmetric Mixture-of-Transformers (MoT) attention to bridge a first-frame visual anchor with temporal tactile dynamics, and (2) contact-gated Action-Visual-Tactile Attention Guidance (AVTAG) to encourage action queries to rely on tactile evidence during contact phases. Across six real-world contact-rich manipulation tasks, VT-WAM achieves a 71.67% average success rate, outperforming Fast-WAM by 26.67% and OmniVTLA by 35.84%. Ablations demonstrate that modeling tactile deformation dynamics and guiding contact-phase tactile attention are both important for contact-rich tasks. Project website: https://vt-wam.github.io/.
AX · 2026-07-02T17:58:28Z
Embodied AI models now span vision-language-action (VLA) models and world-action models (WAMs), but practical deployment remains fragmented across model-specific Python stacks, backend assumptions, and robot-side glue code, especially on heterogeneous edge devices. Existing inference runtimes are designed mainly for request-response serving and therefore do not satisfy the runtime contract of embodied deployment: multi-rate execution inside closed-loop control, latency-first batch-1 inference on heterogeneous hardware, and extensible embodied interfaces beyond fixed token I/O. We present Embodied.cpp, a portable C++ inference runtime for embodied models. Based on an architectural analysis of representative VLA models and WAMs, Embodied.cpp captures a shared execution path and organizes it into five layers: input adapters, sequence builders, backbone execution, head plugins, and deployment adapters. The runtime provides modular multi-rate execution, latency-first fused inference, and extensible operator and I/O support, enabling deployment across heterogeneous devices, robots, and simulators through one backend abstraction. We evaluate Embodied.cpp on two VLA models, HY-VLA and pi0.5, and on a preliminary WAM benchmark using a LingBot-VA Transformer block. The VLA deployments achieve successful closed-loop execution with 100.0% and 91.0% task success rates, respectively. The WAM benchmark reduces block memory from 312.2 MiB to 88.1 MiB. These results show that Embodied.cpp improves deployment efficiency while preserving high accuracy across diverse embodied model architectures.
AX · 2026-07-02T17:55:39Z
Realistic traffic simulation requires agents that imitate logged behavior and can also be steered along interpretable axes. Such controllability enables engineers to isolate variables, reproduce specific edge cases, and test autonomous systems without real-world risk. We introduce Controllable Neural Variational Agents (CNeVA), a controllable simulated-agent framework that learns to infer a per-agent Gaussian behavior latent from per-channel discounted returns via a closed-form conjugate variational update, conditioning a rectified-flow trajectory generator trained on a mixed channel-mask curriculum for classifier-free guidance. To tackle scarcity in reward signals, we propose soft eligibility gates that replace hard binary thresholds with smooth exponential decay, preserving the gradient signal for near-threshold agents. On the Waymo Open Motion Dataset, CNeVA attains competitive realism on the benchmark while exposing per-channel controllability that the higher-ranked imitation models lack. Speed- and acceleration-based steering produces monotone responses without stall-induced reward hacking. Safety controllability is monotone and substantial with the introduction of soft eligibility. We manage to achieve steerable map compliance under a context-residual return measure. Furthermore, our experiment demonstrates that steering metrics must be read alongside physical-plausibility guardrails to avoid reward-hacking confounds.
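The contrast between a hard binary threshold and a soft eligibility gate can be illustrated with a small sketch. The abstract does not give the functional form, so the gate below (full weight inside the threshold, exponential decay beyond it) and the parameter names `threshold` and `scale` are assumptions for illustration, not CNeVA's implementation.

```python
import math

def hard_gate(value, threshold):
    """Binary eligibility: 1 inside the threshold, 0 outside."""
    return 1.0 if value <= threshold else 0.0

def soft_gate(value, threshold, scale=1.0):
    """Assumed soft gate: 1 inside the threshold, exponential decay past it."""
    excess = max(0.0, value - threshold)
    return math.exp(-excess / scale)

# An agent just past the threshold gets zero weight from the hard gate
# but a non-zero, smoothly varying weight from the soft gate.
for value in (1.5, 2.0, 2.1, 3.0):
    print(value, hard_gate(value, 2.0), round(soft_gate(value, 2.0), 3))
```

Because the soft gate is non-zero and has a non-zero slope just past the threshold, near-threshold agents still contribute a learning signal, which is the property the abstract attributes to soft eligibility.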
AX · 2026-07-02T17:43:09Z
This paper presents QuadRocket, a quadrotor-based rocket prototype that provides a low-cost, low-risk platform for validating advanced thrust-vector control strategies for launch vehicle-type systems. The prototype consists of a cylindrical main body mounted on top of a quadrotor through a universal joint, forming a flying inverted pendulum with non-negligible inertia. For control design, the coupled system is modeled as a single axisymmetric rigid body actuated by a vectored force applied along its longitudinal axis. A reduced-attitude representation on the two-sphere is adopted to explicitly exploit the vehicle's axial symmetry and to decouple yaw from the thrust-vector direction. On this model, we derive an adaptive backstepping controller that achieves almost global trajectory tracking in the presence of unknown constant disturbances, while a control-point transformation mitigates non-minimum-phase behavior. The quadrotor is then treated as a thrust-vector actuator, and a dynamic-surface-based attitude controller is designed to track the desired thrust vector, accounting for actuation dynamics and avoiding explicit differentiation of virtual control signals. The complete architecture is evaluated in simulation and validated experimentally in an indoor motion-capture arena. Results demonstrate accurate trajectory tracking and effective disturbance compensation, and confirm the suitability of the QuadRocket as a versatile testbed for thrust-vector-controlled robotic vehicles.
AX · 2026-07-02T17:40:06Z
This paper presents a methodology for learning a control policy to intercept an intruder using the 3D direction unit vector to the intruder and the interceptor state. Prior deep reinforcement learning approaches assume either relative position or distance to the intruder is available, but this information is not readily accessible in real-world applications that employ passive, monocular camera sensors. Instead, we propose a solution that leverages an analytical policy gradient method using differentiable quadrotor dynamics to learn agile interception at speeds up to 10 m/s. The proposed approach outperforms baseline methods that utilize simplified point mass dynamics by an average of 30%.
AX · 2026-07-02T17:59:56Z
As AI coding agents become more autonomous, they increasingly ship code iteratively, with the codebase persisting across sessions. This persistence creates a new attack surface: a misaligned or prompt-injected agent can distribute attacks across pull requests (PRs) and time its payload for the PR with the best natural cover. To study the resulting dynamics, we introduce Iterative VibeCoding, a setting for AI control, the study of safely deploying capable but potentially untrusted AI. In Iterative VibeCoding, a coding agent builds software over a sequence of PRs in a persistent codebase while pursuing a covert side task. Our benchmark includes two task families: CLI tools and Flask web services, across 20 total task variations. We use Claude Sonnet 4.5 as the attack agent and GPT-4o as the monitor. We compare gradual attacks, which distribute the side task across PRs, against non-gradual attacks concentrated in a single PR. No single monitor is robust to both: which strategy evades best (success while evading the monitor) depends on the monitor type, so a defender cannot close off both gradual and non-gradual attacks with any one monitor. High evasion (>= 65%) generalizes across attack-agent model backends (Sonnet 4.5, Gemini 3.1 Pro, Kimi K2.5), confirming this is a property of the persistent-state attack surface rather than a single model's capability. Evasion also remains high across state-of-the-art monitor models, and the gap between gradual and non-gradual evasion widens for more capable models. We introduce a stateful link-tracker monitor that tracks suspicious buildup across PRs. On both task families, it detects gradual attacks substantially better than diff monitors that merely see more accumulated history. Combining this stronger monitor with trajectory monitors in a four-monitor ensemble reduces gradual-attack evasion from 93% under the weakest standard diff monitor to 47%.
AX · 2026-07-02T17:59:52Z
LLMs memorize sensitive training data, including personally identifiable information (PII), creating a pressing need for reliable post hoc removal methods. Unlearning has emerged as a promising solution, with state-of-the-art (SOTA) methods often following a localize-first, unlearn-second paradigm that targets specific model parameters. However, existing benchmarks evaluate unlearning solely at the output level, leaving open the question of whether unlearning truly erases knowledge from a model's parameters or merely obfuscates it, a concern reinforced by the success of resurfacing attacks. To bridge this gap, we introduce LACUNA: the first unlearning testbed with ground-truth parameter-level localization. LACUNA injects PII of synthetic individuals into predefined parameters of 1B and 7B OLMo-based models via masked continual pretraining, enabling direct evaluation of whether unlearning targets the weights responsible for knowledge storage. We use LACUNA to benchmark current SOTA unlearning methods and find that, despite strong output-level performance, existing methods are highly imprecise and susceptible to resurfacing attacks. We further show that when localization is successful, even a simple gradient-based unlearning method achieves strong erasure and robustness to resurfacing attacks, highlighting the importance of precise unlearning. We release LACUNA to complement behavioral evaluations and drive further advances in robust, localization-based unlearning.
AX · 2026-07-02T17:59:50Z
Many everyday programming tasks resist clean rule-based implementation, such as alerting on important log lines, repairing malformed JSON, or ranking search results by intent, and are increasingly outsourced to large language model APIs at the cost of locality, reproducibility, and price. We propose fuzzy-function programming: compiling such a function from a natural-language specification into a compact, locally-executable neural artifact. We instantiate this paradigm with Program-as-Weights (PAW), in which a 4B compiler trained on FuzzyBench, a 10M-example dataset we release, emits parameter-efficient adapters for a frozen, lightweight interpreter. A 0.6B Qwen3 interpreter executing PAW programs matches the performance of direct prompting of Qwen3-32B, while using roughly one fiftieth of the inference memory and running at 30 tokens/s on a MacBook M3. PAW reframes the foundation model from a per-input problem solver into a tool builder: invoked once per function definition, it produces a small reusable artifact whose subsequent calls per function application are cheap and offline.
AX · 2026-07-02T17:59:43Z
Despite alignment training, LLMs remain prone to generating unsafe outputs at deployment time. Monitoring outputs online and raising an alarm when safety can no longer be assumed is therefore critical. We study a simple real-time monitor that turns a verifier signal from an external model into an alarm decision by thresholding, with the threshold calibrated via risk control. In experiments on mathematical reasoning and red teaming datasets, we show that this simple design is competitive with more advanced monitors based on sequential hypothesis testing.
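A threshold monitor of this kind can be sketched in a few lines. The abstract does not specify the calibration procedure, so the version below is an assumption: it treats a higher verifier score as "more likely safe", raises an alarm when the score falls below the threshold, and picks the smallest threshold whose adjusted miss rate on a labeled calibration set, `(misses + 1) / (n + 1)` in the style of conformal risk control, stays at or below a target `alpha`. The function names and the calibration data are hypothetical.

```python
def calibrate_threshold(scores, unsafe, alpha):
    """Smallest threshold whose adjusted miss rate is <= alpha, else None.

    A miss is an unsafe output whose score is at or above the threshold,
    so no alarm is raised for it. Assumes higher score = safer.
    """
    n = len(scores)
    candidates = sorted(set(scores)) + [float("inf")]
    for lam in candidates:  # ascending: fewer alarms first
        misses = sum(1 for s, u in zip(scores, unsafe) if u and s >= lam)
        if (misses + 1) / (n + 1) <= alpha:
            return lam
    return None  # target too strict for this calibration set size

def alarm(score, threshold):
    """Raise an alarm when the verifier score drops below the threshold."""
    return score < threshold

# Hypothetical calibration set: verifier scores and unsafe labels.
scores = [0.9, 0.8, 0.3, 0.2, 0.7, 0.1, 0.95, 0.4, 0.6]
unsafe = [False, False, True, True, False, True, False, True, False]

threshold = calibrate_threshold(scores, unsafe, alpha=0.25)
print(threshold, alarm(0.3, threshold), alarm(0.7, threshold))
```

Scanning thresholds in ascending order returns the least aggressive alarm rule that still meets the target, since the miss count can only fall as the threshold rises.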
AX · 2026-07-02T17:59:26Z
Understanding and reasoning over long contexts has become a key requirement for deploying large language models (LLMs) in realistic applications. Although recent LLMs support increasingly long context windows, they often fail to use relevant evidence that is already present in the input, revealing a gap between context access and effective context utilization. In this work, we propose Recursive Evidence Replay as LLM Harness for Long-Context Reasoning (RECONTEXT), a training-free inference method for improving long-context reasoning. RECONTEXT uses model-internal relevance signals to construct a query-conditioned evidence pool and replays it before final generation while preserving the full original context. This recursive selection process separates evidence organization from answer generation without training, external memory, or context pruning. We also provide a theoretical analysis based on associative memory, which characterizes the context as a memory store, the question as a retrieval cue, attention as cue-trace association, and replay as trace reactivation. Experiments on eight long-context datasets with 128K context length show that RECONTEXT consistently improves evidence utilization across Qwen3-4B, Qwen3-8B, and Llama3-8B, achieving the best average rank on all three backbones. Code is available at https://github.com/Yanjun-Zhao/ReContext.
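The replay idea described above (keep the full context, then restate the most relevant evidence before the question) can be sketched as follows. This is a single-pass illustration under stated assumptions, not the RECONTEXT implementation: the relevance scores are passed in as plain numbers standing in for the model-internal signals, and the prompt layout, the function name `build_replay_prompt` and the parameter `k` are hypothetical.

```python
def build_replay_prompt(chunks, relevance, question, k=2):
    """Keep every chunk, then replay the k highest-scoring ones before the question."""
    # Indices of the k most relevant chunks (relevance is a stand-in signal).
    top = sorted(range(len(chunks)), key=lambda i: relevance[i], reverse=True)[:k]
    # Replay evidence in its original document order.
    evidence = [chunks[i] for i in sorted(top)]
    parts = ["\n".join(chunks), "Relevant evidence:", *evidence, f"Question: {question}"]
    return "\n".join(parts)

# Hypothetical context chunks and relevance scores.
chunks = ["Alice moved to Oslo.", "The weather was mild.", "Bob stayed in Rome.", "Alice works at a lab."]
relevance = [0.9, 0.1, 0.2, 0.8]

print(build_replay_prompt(chunks, relevance, "Where does Alice live?"))
```

Nothing is pruned from the original context here; the selected chunks are simply repeated closer to the question, which matches the abstract's point that replay preserves the full input.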
Patent Filings · 5
PT · 2020-06-02
Some non-limiting examples follow. In modern robotic automation, the part that is dealt with could be very complex and have several features, e.g., holes, on it. The relationship between the features can be used to efficiently learn the robot and/or part positions or other process parameters.
PT · 2014-12-10
Method for teaching a robot movement (84 - 88 - 90 - 92) using a system comprising a robot (36, 94), a robot controller (34, 96) with at least an automatic mode and a teach mode, a programmable logic controller (PLC) (32) which is connected (38) to the robot controller (34, 96), whereas the …
PT · 2019-05-07
The invention relates to a method for programming a robot, in particular a robot comprising a robotic arm, in which method a movement to be performed by the robot is set up preferably in a robot programme by means of a predefined motion template, the motion template is selected from a database …
PT · 2025-10-03
A robotic controller for controlling a robotic arm is disclosed, the robotic controller comprising: a first spatial shaping module configured to provide a shaped first-space target motion by convolving a first-space target motion with a pulse train, wherein the first-space target motion defines a …
PT · 2026-04-28
The numerical control system (1) includes a numerical control device (5) and a robot control device (6). The numerical control device (5) includes: a robot instruction generation unit (55) that generates robot instructions for each robot instruction block; a robot program start instruction unit ( …
Accelerating Keywords · 15
#01 · unlearning · 7.80 · 10↑ / base 1.4
#02 · intruder · 7.67 · 4↑ / base 0.6
#03 · tactile · 7.66 · 7↑ / base 1.0
#04 · attacks · 7.54 · 9↑ / base 1.3
#05 · evidence · 7.52 · 7↑ / base 1.0
#06 · vt-wam · 7.47 · 5↑ / base 0.7
#07 · recontext · 7.42 · 5↑ / base 0.7
#08 · alarm · 7.41 · 2↑ / base 0.3
#09 · outputs · 7.41 · 2↑ / base 0.3
#10 · lacuna · 7.40 · 5↑ / base 0.7
#11 · deformation · 7.38 · 4↑ / base 0.6
#12 · cpp · 7.36 · 5↑ / base 0.7
#13 · embodied cpp · 7.36 · 5↑ / base 0.7
#14 · long-context · 7.34 · 4↑ / base 0.6
#15 · differentiable quadrotor · 7.33 · 2↑ / base 0.3
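Each keyword entry pairs today's story count with a baseline (for example `10↑ / base 1.4`), and the Executive Brief's "7× its baseline" figure follows from dividing one by the other. The sketch below shows only that ratio; how the dashboard derives the separate score value (such as 7.80) is not stated and is not reproduced here. The function name `baseline_multiple` is hypothetical.

```python
def baseline_multiple(count, baseline):
    """How many times above its baseline a keyword is running today."""
    return count / baseline

# (keyword, today's count, baseline) taken from the first three entries above.
rows = [("unlearning", 10, 1.4), ("intruder", 4, 0.6), ("tactile", 7, 1.0)]

for term, count, base in rows:
    print(f"{term}: {baseline_multiple(count, base):.1f}x baseline")
```

The displayed baselines are rounded to one decimal place, so multiples computed from them are approximate.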
Market Movers · 5
#01   ARXIV
IMPACT
6.25
MDB $354.88 (-1.25%), MSFT $390.49 (+1.62%), GOOGL $359.91 (-0.36%), NVDA $194.83 (-1.39%)
novelty spike · multi-ticker · large-cap exposure · open-source release · benchmark lead
Why it matters — Rising momentum suggests near-term attention and follow-on activity.
Why it matters — Impacts multiple players/supply chain; effects may propagate.
#02   ARXIV
IMPACT
5.00
MSFT $390.49 (+1.62%), GOOGL $359.91 (-0.36%), META $582.90 (-4.90%), NVDA $194.83 (-1.39%), MDB $354.88 (-1.25%), ARM $315.28 (-6.58%)
novelty spike · multi-ticker · large-cap exposure · benchmark lead · real-time/edge · safety/alignment · hardware
Why it matters — Rising momentum suggests near-term attention and follow-on activity.
Why it matters — Impacts multiple players/supply chain; effects may propagate.
#03   ARXIV
IMPACT
3.17
MSFT $390.49 (+1.62%), GOOGL $359.91 (-0.36%), META $582.90 (-4.90%), NVDA $194.83 (-1.39%), MDB $354.88 (-1.25%)
novelty spike · multi-ticker · large-cap exposure · open-source release · efficiency/cost · patent application · safety/alignment · data/training
Why it matters — Rising momentum suggests near-term attention and follow-on activity.
Why it matters — Impacts multiple players/supply chain; effects may propagate.
#04   ARXIV
IMPACT
2.75
MSFT $390.49 (+1.62%), GOOGL $359.91 (-0.36%), META $582.90 (-4.90%), NVDA $194.83 (-1.39%)
novelty spike · multi-ticker · large-cap exposure · real-time/edge · safety/alignment · data/training · hardware
Why it matters — Rising momentum suggests near-term attention and follow-on activity.
Why it matters — Impacts multiple players/supply chain; effects may propagate.
#05   ARXIV
IMPACT
1.92
MDB $354.88 (-1.25%), MSFT $390.49 (+1.62%), GOOGL $359.91 (-0.36%)
multi-ticker · large-cap exposure · benchmark lead · patent application
Why it matters — Impacts multiple players/supply chain; effects may propagate.
Why it matters — Large-cap ties can amplify market impact and adoption.
Equity Signals · 3
BYDDF · 7.52 · evidence
RIVN · 7.52 · evidence
TSLA · 7.52 · evidence
High-Potential Items · 5
arXiv
Why it matters — Rising momentum suggests near-term attention and follow-on activity.
arXiv
Why it matters — Rising momentum suggests near-term attention and follow-on activity.
arXiv
Why it matters — Rising momentum suggests near-term attention and follow-on activity.
arXiv
Why it matters — Rising momentum suggests near-term attention and follow-on activity.
arXiv
Why it matters — Research acceleration can foreshadow capabilities entering products.
Google Trends (US / 1m) · 8
Term
unlearning
intruder
tactile
attacks
evidence
vt-wam
recontext
alarm