The audio makes the d sound like D. I wonder if that is because of the previous S. Assimilation?
If assimilation occurred, I think it would show up in the spelling. As an example, consider these verbs put on the pattern افتعل:
- سمع ← استمع
- غرر ← اغترَّ
- لحف ← التحف
- عرف ← اعترف
- ضرب ← اضطرب
- ضرر ← اضطرّ
In the last two examples, the usual T sound (ت) of the افتعل pattern becomes the emphatic (velarized) ط so that it harmonizes with the preceding emphatic ض.
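For anyone who likes to see the rule spelled out mechanically, here is a rough Python sketch of the pattern in the list above. The function name `form_viii` and the constant names are my own, made up for illustration. It only covers what the examples show, plus one assumption of mine: that a first radical ص behaves like ض (as in صبر ← اصطبر). Roots starting with ط, ظ, د, ذ, ز or و undergo their own adjustments, so the sketch refuses them rather than guess.

```python
SHADDA = "\u0651"  # Arabic doubling mark

# First radicals that turn the infixed ت into ط.
# ض comes from the examples above; ص is an added assumption.
EMPHATIC_FIRST = set("صض")

# First radicals with other adjustments that this sketch does not model.
UNHANDLED_FIRST = set("طظدذزو")


def form_viii(root: str) -> str:
    """Return the unvocalized افتعل form of a three-letter root."""
    if len(root) != 3:
        raise ValueError("expected a three-letter root")
    first, second, third = root
    if first in UNHANDLED_FIRST:
        raise NotImplementedError(f"first radical {first} is not covered here")
    # The infix assimilates in emphasis to the first radical.
    infix = "ط" if first in EMPHATIC_FIRST else "ت"
    if second == third:
        # Doubled roots: write the repeated radical once, with a shadda.
        tail = second + SHADDA
    else:
        tail = second + third
    return "ا" + first + infix + tail


if __name__ == "__main__":
    for r in ["سمع", "غرر", "لحف", "عرف", "ضرب", "ضرر"]:
        print(r, "←", form_viii(r))
```

The point is simply that the change from ت to ط is written down in the output, which is why I would expect any real assimilation to be visible in the spelling.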
That said, I still hear it as "D" in the audio. I did notice, though, that the speaker, although machine-generated, is apparently based on an Egyptian female voice; the tone of the speech is distinctive. Speakers of that dialect do sometimes tend to velarize the "D". Personally, I can pronounce the syllables صد and صض quite differently, and easily.