From what I read elsewhere (a random reddit comment), the visible reasoning is just "for show" and isn't the actual process DeepSeek used to arrive at the result. But if the reasoning has value, I guess it doesn't matter even if it's fake.
R1's technical report (https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSee...) says the prompt template used for training is "<think> reasoning process here </think> <answer> answer here </answer>. User: prompt. Assistant:" That format strongly suggests the text between the <think> tags becomes the "reasoning" and the text between the <answer> tags becomes the "answer" shown in the web app and API (https://api-docs.deepseek.com/guides/reasoning_model). I see no reason why DeepSeek would not do it that way, barring some post-generation filtering.
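To make that concrete, here is a minimal sketch of how output following that training template could be split into the two parts the web app displays. This is only an illustration of the tag structure quoted from the report, not DeepSeek's actual serving code; the function name and the sample string are mine.

    import re

    def split_r1_output(completion: str) -> tuple[str, str]:
        """Split an R1-style completion into (reasoning, answer).

        Assumes the completion follows the template from the R1 report,
        i.e. "<think> ... </think> <answer> ... </answer>". If a tag is
        missing, that part comes back as an empty string.
        """
        think = re.search(r"<think>(.*?)</think>", completion, re.DOTALL)
        answer = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
        return (
            think.group(1).strip() if think else "",
            answer.group(1).strip() if answer else "",
        )

    raw = "<think> Wait, let me re-check that edge case... </think> <answer> 42 </answer>"
    reasoning, answer = split_r1_output(raw)
    print(reasoning)  # Wait, let me re-check that edge case...
    print(answer)     # 42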
Plus, table 3 of the R1 technical report contains an example of R1's chain of thought, and its style (going back and re-evaluating the problem) resembles the CoT I actually get in the web app.
That's a bad reddit comment though; try pair programming with it.
The reasoning usually comments on your request, extends it, figures out which solution is best and actually usable, backtracks if it finds issues while implementing it, proposes a new solution, and verifies that it more or less makes sense.
The final answer after that can look quite different for ordinary questions (i.e. summarised the way a typical ChatGPT answer would look).
But it is usually very coherent with the code part: if, for example, it has to choose between two libraries, it will of course use the one picked in the reasoning part.