As someone who developed chatbots with HMM's and the Transformers algorithms, th...

vjerancrnjak · 2025-11-20T21:17:34 1763673454

Markov Random Fields also do that.

Difference is obviously there but nothing prevents you from undirected conditioning of long range dependencies. There’s no need to chain anything.

The problem from a math standpoint is that it’s an intractable exercise. The moment you start relaxing the joint opt problem you’ll end up at a similar place.