Implement a function to compute pairwise Euclidean distances between all vectors in a batch of tensors using PyTorch or TensorFlow, and explain the time and space complexity of your solution.
Interview
How to structure your answer
To compute pairwise Euclidean distances between all vectors in a batch, first expand the input tensor to create two batches (a and b) with broadcasting. Compute squared differences between all pairs, sum along the feature dimension, and take the square root. Use PyTorch's broadcasting and vectorized operations to avoid explicit loops. This approach ensures efficiency and leverages GPU acceleration for large batches.
Sample answer
The solution uses PyTorch's broadcasting to compute pairwise distances. First, expand the input tensor to create two batches (a and b) with shapes (n, 1, d) and (1, n, d), where n is the number of vectors and d is the feature dimension. Subtract b from a to get pairwise differences, square the result, sum across features, and take the square root. Time complexity is O(n²) due to computing all pairs, and space complexity is O(n²) for storing the distance matrix. This avoids explicit loops and uses vectorized operations for efficiency. For a batch of size n, the computation scales quadratically with n.
Key points to mention
- • batch dimension handling
- • avoiding explicit for-loops
- • correct use of tensor operations for efficiency
Common mistakes to avoid
- ✗ incorrectly assuming O(n) time complexity
- ✗ forgetting to square the differences
- ✗ not using batch processing correctly