nptl: Optimize trylock for high cache contention workloads (BZ #33704)

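Check whether the lock is free with a relaxed atomic load before
attempting the atomic acquisition.  When the lock is already held,
trylock now fails without issuing an atomic read-modify-write, which
reduces cache line bouncing and improves trylock throughput on
multi-core systems under heavy contention.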
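
The check-before-acquire pattern described above can be sketched in
portable C11 as follows.  This is only an illustration under stated
assumptions: demo_lock_t, demo_trylock and demo_unlock are hypothetical
names, and a plain compare-and-swap stands in for glibc's internal
lll_trylock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration; not glibc's pthread_mutex_t.
   state is 0 when the lock is free and 1 when it is held.  */
typedef struct
{
  atomic_int state;
} demo_lock_t;

/* Try to take the lock without blocking.  Returns true on success.  */
static bool
demo_trylock (demo_lock_t *l)
{
  /* Read-only check first.  If the lock is held, fail here without
     attempting a read-modify-write, so the cache line does not have
     to be acquired in exclusive state by every failing caller.  */
  if (atomic_load_explicit (&l->state, memory_order_relaxed) != 0)
    return false;

  /* The lock looked free, so attempt the real acquisition.  Another
     thread may still win the race, in which case the CAS fails.  */
  int expected = 0;
  return atomic_compare_exchange_strong_explicit (&l->state, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed);
}

/* Release the lock.  The caller must currently hold it.  */
static void
demo_unlock (demo_lock_t *l)
{
  assert (atomic_load_explicit (&l->state, memory_order_relaxed) == 1);
  atomic_store_explicit (&l->state, 0, memory_order_release);
}
```

The relaxed load is only a hint: correctness still rests on the
compare-and-swap, so a stale read can at worst cause a spurious
failure, which a trylock caller has to handle anyway.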

Tested on x86_64.

Fixes BZ #33704.

Co-authored-by: Alex M Wells <alex.m.wells@intel.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit 63716823db)
Sunil K Pandey
2025-12-09 08:57:44 -08:00
parent a94467ce05
commit 9e1a305028

@@ -48,7 +48,8 @@ ___pthread_mutex_trylock (pthread_mutex_t *mutex)
 return 0;
 }
-if (lll_trylock (mutex->__data.__lock) == 0)
+if (atomic_load_relaxed (&(mutex->__data.__lock)) == 0
+&& lll_trylock (mutex->__data.__lock) == 0)
 {
 /* Record the ownership. */
 mutex->__data.__owner = id;
@@ -71,7 +72,10 @@ ___pthread_mutex_trylock (pthread_mutex_t *mutex)
 /*FALL THROUGH*/
 case PTHREAD_MUTEX_ADAPTIVE_NP:
 case PTHREAD_MUTEX_ERRORCHECK_NP:
-if (lll_trylock (mutex->__data.__lock) != 0)
+/* Mutex type is already loaded, lock check overhead should
+be minimal. */
+if (atomic_load_relaxed (&(mutex->__data.__lock)) != 0
+|| lll_trylock (mutex->__data.__lock) != 0)
 break;
 /* Record the ownership. */