  The most prestigious law school admissions discussion board in the world.

My Whatsapp group with BIGTECH bros is >> xo these days


Date: April 28th, 2026 11:49 PM
Author: https://imgur.com/a/o2g8xYK


It's invite only and everything is AI related and it's 180

(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850251)




Date: April 28th, 2026 11:50 PM
Author: https://imgur.com/a/o2g8xYK


Some CEOdood just poasted this

"so I ran some more math, and newer Xeon machines (Grand Rapids etc) with 1.5TB of RAM (MRDIMM - more throughput than DDR5) can deal with more concurrent users on MoE models like Kimi 2.6 and GLM 5.1 with longer context windows (>200k) though with reduced tok/sec/user for about 1/4-1/5 of the price of B200x8 or 2x(H100x8).

That means that some of my "batching" theories, assuming I can prove out dedicated kernels on AMX and AVX512, imply that headless coding agents that don't have a human following them and run in the background can be executed on said models in a MUCH cheaper way.

Running a first test on Grand Rapids CPUs (only managed to get 16 CPUs - GCP won't give me quota).

You would think, with the number of models being downloaded, most clouds would benefit from running a proxy for them the same way they do for Linux packages..."
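Rough shape of his math, as capex arithmetic. Every number below is a made-up placeholder for illustration, not a quoted price or a measured session count:

```python
# Capex arithmetic behind the "1/4-1/5 of the price" claim.
# ALL numbers below are made-up placeholders, not quoted prices.

def cost_ratio(cpu_server_usd, gpu_server_usd):
    """What fraction of the GPU build's price the CPU build costs."""
    return cpu_server_usd / gpu_server_usd

def usd_per_concurrent_session(server_usd, max_sessions):
    """Capex per concurrently served long-context user."""
    return server_usd / max_sessions

CPU_BOX_USD = 80_000    # hypothetical 2-socket Xeon box, 1.5TB MRDIMM
GPU_BOX_USD = 400_000   # hypothetical 8-GPU server

ratio = cost_ratio(CPU_BOX_USD, GPU_BOX_USD)    # 0.2 -> the "1/5" end

# With >200k-token contexts, KV-cache memory is what caps concurrency, so
# 1.5TB of system RAM can hold more sessions than HBM can, at lower tok/sec
# per user -- which is fine for headless agents nobody is watching.
cpu_seat = usd_per_concurrent_session(CPU_BOX_USD, 40)   # assumed 40 sessions
gpu_seat = usd_per_concurrent_session(GPU_BOX_USD, 32)   # assumed 32 sessions
```

The point of the second function is that "cheaper" here is per concurrent background session, not per token at interactive speed.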



(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850253)




Date: April 28th, 2026 11:52 PM
Author: Juan Eighty

CEO bragging about doing setup steps on the max settings

wtf is Grand Rapids Michigan CPU

(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850260)




Date: April 28th, 2026 11:50 PM
Author: Juan Eighty

You posted this exact thread last week

Provide updated screenshots or gtfo

(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850254)




Date: April 29th, 2026 12:48 AM
Author: https://imgur.com/a/o2g8xYK


I can't tell what these people are saying a lot of the time, and redacting shit is tedious

https://i.imgur.com/F6Nwlsl.png

(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850321)




Date: April 29th, 2026 12:06 AM
Author: oomox

Sounds douchey but potentially interesting. More douchey tho.

(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850281)




Date: April 29th, 2026 12:32 AM
Author: https://imgur.com/a/o2g8xYK


Speaking as a VC, I will tell you that I hope you aren’t using VC money to buy off-brand GPUs. We all know that the most valuable AI startups use NVIDIA(tm). I have a special relationship with them, actually, so we can get you to the front of the queue.

(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850306)




Date: April 29th, 2026 12:37 AM
Author: https://imgur.com/a/o2g8xYK


i have a couple of "code parser in the sampling loop" + "code parser controlled KV cache endpoints" experiments that are pretty promising, also in the context of CPU inference. I don't really have the time or the knowledge to properly explore them, but if someone here is interested, I feel this can unlock some pretty interesting ideas for keeping smaller models on track, and in general for exploiting the massive power that comes with controlling the inference end to end. I did a couple of experiments and my ideas do work (very messy repo: https://github.com/REDACTED), if someone is interested in a chat. (I think small local models + custom inference harness + tools optimized on trace analysis / GEPA-like / autoresearch loops are potentially competitive with much, much larger models for a fair amount of the "lower level but token consuming" tasks in the context of coding.)

roughly the idea is that if you have an AST parser + compiler/interpreter in the loop, you can do the following things:

- flag syntax errors (obv) + easily gaslight the model into thinking it wrote them correctly

- autocomplete a set of patterns to save on inference compute by just prefilling the completion

- rewind the cache / inference pointer (not sure what the proper name is) to semantically meaningful points (say, function starts)

- insert // docstrings that help the model along for trickier APIs

- for certain types of functions / unit tests, run the unit tests right after the code was generated, parallelizing with inference, and you can easily rewind

- on CPU, where you are memory constrained (I guess it's the same on GPUs but I haven't done GPGPU stuff since 2012), you can run speculative loops / split the code writing across multiple cores and then merge

All of these things mostly benefit from being tailored to your own codebase / being RL-optimized to your codebase through transcript analysis, so model providers, who are more reliant on batching similar workloads and are often "too fast" for this kind of trickery, won't really benefit from it.
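A toy version of the first and third bullets (flag syntax errors, rewind to the last semantically meaningful point), with a stub generator standing in for the model. Everything here is illustrative and assumed; none of it is from the actual repo:

```python
import ast

# Toy "code parser in the sampling loop": the harness keeps a checkpoint at
# the last syntactically valid prefix (stand-in for a KV-cache rewind point)
# and rolls the buffer back whenever a generated chunk breaks the parse.

def stream():
    """Pretend model output, emitted line by line; one line is broken."""
    yield "def add(a, b):\n"
    yield "    return a + b\n"
    yield "def mul(a, b)\n"     # missing colon -> should be caught + rewound
    yield "def mul(a, b):\n"    # the model's "retry" after the rewind
    yield "    return a * b\n"

def parses(src):
    """True if src is a syntactically valid module so far."""
    try:
        ast.parse(src)
        return True
    except SyntaxError:         # IndentationError is a subclass, also caught
        return False

buf, checkpoint = "", ""
for chunk in stream():
    candidate = buf + chunk
    if parses(candidate):
        buf = checkpoint = candidate    # valid prefix: commit it
    elif chunk.rstrip().endswith(":"):
        buf = candidate                 # block opener: accept, body pending
    else:
        buf = checkpoint                # syntax error: rewind, model retries

# the surviving checkpoint is clean, runnable code
ns = {}
exec(compile(checkpoint, "<gen>", "exec"), ns)
```

The same checkpoints are where you could prefill autocomplete patterns or splice in helper docstrings (the second and fourth bullets), since the parser tells you exactly where a function starts.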

(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850311)




Date: April 29th, 2026 12:40 AM
Author: https://imgur.com/a/o2g8xYK


At the NVIDIA conf last year, I ran into a liquid-cooled server rack vendor I have known since I was a gothling setting up Sun gear in datacenters to pay for fangs and leather jackets. I said “I thought you retired”. He said “no, we’re doing it all one last time. this time for real money”.

(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850314)




Date: April 29th, 2026 1:21 AM
Author: https://imgur.com/a/o2g8xYK


https://i.imgur.com/IIch0AM.png

(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850376)




Date: April 29th, 2026 1:38 AM
Author: Emotionally + Physically Abusive Ex-Husband (oppose bitchbois)

Whoa, nice!

(http://www.autoadmit.com/thread.php?thread_id=5861511&forum_id=2#49850406)