Result 1 to 20 from 245 total
How many threads to spawn during program multithreading? (English)
Cooper, Keith (ed.) et al., Languages and compilers for parallel computing. 23rd international workshop, LCPC 2010, Houston, TX, USA, October 7‒9, 2010. Revised selected papers. Berlin: Springer (ISBN 978-3-642-19594-5/pbk). Lecture Notes in Computer Science 6548, 166-183 (2011).
1
Improving accuracy for matrix multiplications on GPUs. (English)
Sci. Program. 19, No. 1, 3-11 (2011).
2
Exploiting parallelism in matrix-computation kernels for symmetric multiprocessor systems: matrix-multiplication and matrix-addition algorithm optimizations by software pipelining and threads allocation (English)
ACM Trans. Math. Softw. 38, No. 1, 2 (2011).
3
Pruning hardware evaluation space via correlation-driven application similarity analysis (English)
Conf. Computing Frontiers, 4 (2011).
4
Exploitation of nested thread-level speculative parallelism on multi-core systems (English)
Conf. Computing Frontiers, 99-100 (2010).
5
Pretty good accuracy in matrix multiplication with GPUs (English)
ISPDC, 49-55 (2010).
6
How many threads to spawn during program multithreading? (English)
LCPC, 166-183 (2010).
7
On the efficacy of call graph-level thread-level speculation (English)
WOSP/SIPEW, 247-248 (2010).
8
Optimizing control flow in loops using interval and dependence analysis. (English)
Des. Autom. Embed. Syst. 13, No. 3, 193-221 (2009).
9
Brain derived vision algorithm on high performance architectures. (English)
Int. J. Parallel Program. 37, No. 4, 345-369 (2009).
10
On the exploitation of loop-level parallelism in embedded applications. (English)
ACM Trans. Embed. Comput. Syst. 8, No. 2 (2009).
11
Performance characterization of Itanium® 2-based Montecito processor. (English)
Kaeli, David (ed.) et al., Computer performance evaluation and benchmarking. SPEC benchmark workshop 2009, Austin, TX, USA, January 25, 2009. Proceedings. Berlin: Springer (ISBN 978-3-540-93798-2/pbk). Lecture Notes in Computer Science 5419, 36-56 (2009).
12
Optimizing control flow in loops using interval and dependence analysis (English)
Design Autom. for Emb. Sys. 13, No. 3, 193-221 (2009).
13
Adaptive Winograd’s matrix multiplications (English)
ACM Trans. Math. Softw. 36, No. 1 (2009).
14
Directional persistence and the optimality of run-and-tumble chemotaxis (English)
Computational Biology and Chemistry 33, No. 4, 269-274 (2009).
15
Performance characterization of Itanium® 2-based Montecito processor (English)
SPEC Benchmark Workshop, 36-56 (2009).
16
Efficient scheduling of nested parallel loops on multi-core systems (English)
ICPP, 74-83 (2009).
17
Synchronization optimizations for efficient execution on multi-cores (English)
ICS, 169-180 (2009).
18
Cache-aware partitioning of multi-dimensional iteration spaces (English)
SYSTOR, 15 (2009).
19
Efficient simulation of large-scale spiking neural networks using CUDA graphics processors (English)
IJCNN, 2145-2152 (2009).
20