AMD_performance_monitor
Name
AMD_performance_monitor
Name Strings
GL_AMD_performance_monitor
Contributors
Dan Ginsburg
Aaftab Munshi
Dave Oldcorn
Maurice Ribble
Jonathan Zarge
Contact
Dan Ginsburg (dan.ginsburg 'at' amd.com)
Status
???
Version
Last Modified Date: 11/29/2007
Number
OpenGL Extension #360
OpenGL ES Extension #50
Dependencies
None
Overview
This extension enables the capture and reporting of performance monitors.
Performance monitors contain groups of counters which hold arbitrary counted
data. Typically, the counters expose values from performance-related
counters in the underlying hardware. The extension is general enough to
allow the implementation to choose which counters to expose and pick the
data type and range of the counters. The extension also allows counting to
start and end on arbitrary boundaries during rendering.
Issues
1. Should this be an EGL or OpenGL/OpenGL ES extension?
Decision - Make this an OpenGL/OpenGL ES extension
Reason - We would like to expose this extension in both OpenGL and
OpenGL ES which makes EGL an unsuitable choice. Further, support for
EGL is not a requirement and there are platforms that support OpenGL ES
but not EGL, making it difficult to make this an EGL extension.
2. Should the API support multipassing?
Decision - No.
Reason - Multipassing should really be left to the application to do.
This makes the API unnecessarily complicated. A major issue is that
depending on which counters are to be sampled, the # of passes and which
counters get selected in each pass can be difficult to determine. It is
much easier to give a list of counters categorized by groups with
specific information on the number of counters that can be selected from
each group.
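Because multipassing is left to the application, one simple way to plan it is to split the counters wanted from each group into chunks no larger than that group's <maxActiveCounters>. The sketch below shows only that arithmetic. GroupRequest and passesNeeded are hypothetical names that do not belong to the extension, and the sketch assumes each group's limit is independent of the other groups, which an implementation is not required to guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical application-side record: how many counters the application
 * wants from one group, and that group's limit as reported in
 * <maxActiveCounters> by GetPerfMonitorCountersAMD. */
typedef struct
{
    int numWanted;
    int maxActiveCounters;
} GroupRequest;

/* Returns the number of passes needed so that no pass enables more than
 * maxActiveCounters counters from any group, or -1 if counters are wanted
 * from a group that allows no active counters at all. */
int
passesNeeded(const GroupRequest *requests, size_t numRequests)
{
    int passes = 0;

    assert(requests != NULL || numRequests == 0);

    for (size_t i = 0; i < numRequests; i++)
    {
        int wanted = requests[i].numWanted;
        int limit  = requests[i].maxActiveCounters;
        int needed;

        if (wanted <= 0)
            continue;           /* nothing requested from this group */
        if (limit <= 0)
            return -1;          /* requested counters can never be sampled */

        needed = (wanted + limit - 1) / limit;   /* ceil(wanted / limit) */
        if (needed > passes)
            passes = needed;
    }
    return passes;
}
```

Even with such a plan, BeginPerfMonitorAMD may still reject a particular combination of counters, so the application should check for errors on every pass.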
3. Should we define a 64-bit data type for UNSIGNED_INT64_AMD?
Decision - No.
Reason - While counters can be returned as 64-bit unsigned integers, the
data is passed back to the application inside of a void*. Therefore,
there is no need in this extension to define a 64-bit data type (e.g.,
GLuint64). It will be up to the application to declare a native 64-bit
unsigned integer and cast the returned data to that type.
New Procedures and Functions
void GetPerfMonitorGroupsAMD(int *numGroups, sizei groupsSize,
uint *groups)
void GetPerfMonitorCountersAMD(uint group, int *numCounters,
int *maxActiveCounters, sizei countersSize,
uint *counters)
void GetPerfMonitorGroupStringAMD(uint group, sizei bufSize, sizei *length,
char *groupString)
void GetPerfMonitorCounterStringAMD(uint group, uint counter, sizei bufSize,
sizei *length, char *counterString)
void GetPerfMonitorCounterInfoAMD(uint group, uint counter,
enum pname, void *data)
void GenPerfMonitorsAMD(sizei n, uint *monitors)
void DeletePerfMonitorsAMD(sizei n, uint *monitors)
void SelectPerfMonitorCountersAMD(uint monitor, boolean enable,
uint group, int numCounters,
uint *counterList)
void BeginPerfMonitorAMD(uint monitor)
void EndPerfMonitorAMD(uint monitor)
void GetPerfMonitorCounterDataAMD(uint monitor, enum pname, sizei dataSize,
uint *data, int *bytesWritten)
New Tokens
Accepted by the <pname> parameter of GetPerfMonitorCounterInfoAMD
COUNTER_TYPE_AMD 0x8BC0
COUNTER_RANGE_AMD 0x8BC1
Returned as a valid value in <data> parameter of
GetPerfMonitorCounterInfoAMD if <pname> = COUNTER_TYPE_AMD
UNSIGNED_INT 0x1405
FLOAT 0x1406
UNSIGNED_INT64_AMD 0x8BC2
PERCENTAGE_AMD 0x8BC3
Accepted by the <pname> parameter of GetPerfMonitorCounterDataAMD
PERFMON_RESULT_AVAILABLE_AMD 0x8BC4
PERFMON_RESULT_SIZE_AMD 0x8BC5
PERFMON_RESULT_AMD 0x8BC6
Addition to the GL specification
Add a new section called Performance Monitoring
A performance monitor consists of a number of hardware and software counters
that can be sampled by the GPU and reported back to the application.
Performance counters are organized as a single hierarchy where counters are
categorized into groups. Each group has a list of counters that belong to
the group and can be sampled, and a maximum number of counters that can be
sampled at any one time.
The command
void GetPerfMonitorGroupsAMD(int *numGroups, sizei groupsSize,
uint *groups);
returns the number of available groups in <numGroups>, if <numGroups> is
not NULL. If <groupsSize> is not 0 and <groups> is not NULL, then the list
of available groups is returned. The number of entries that will be
returned in <groups> is determined by <groupsSize>. If <groupsSize> is 0,
no information is copied. Each group is identified by a unique unsigned int
identifier.
The command
void GetPerfMonitorCountersAMD(uint group, int *numCounters,
int *maxActiveCounters,
sizei countersSize,
uint *counters);
returns the following information. For each group, it returns the number of
available counters in <numCounters>, the max number of counters that can be
active at any time in <maxActiveCounters>, and the list of counters in
<counters>. The number of entries that can be returned in <counters> is
determined by <countersSize>. If <countersSize> is 0, no information is
copied. Each counter in a group is identified by a unique unsigned int
identifier. If <group> does not reference a valid group ID, an
INVALID_VALUE error is generated.
The command
void GetPerfMonitorGroupStringAMD(uint group, sizei bufSize,
sizei *length, char *groupString)
returns the string that describes the group name identified by <group> in
<groupString>. The actual number of characters written to <groupString>,
excluding the null terminator, is returned in <length>. If <length> is
NULL, then no length is returned. The maximum number of characters that
may be written into <groupString>, including the null terminator, is
specified by <bufSize>. If <bufSize> is 0 and <groupString> is NULL, the
number of characters that would be required to hold the group string,
excluding the null terminator, is returned in <length>. If <group>
does not reference a valid group ID, an INVALID_VALUE error is generated.
The command
void GetPerfMonitorCounterStringAMD(uint group, uint counter,
sizei bufSize, sizei *length,
char *counterString);
returns the string that describes the counter name identified by <group>
and <counter> in <counterString>. The actual number of characters written
to <counterString>, excluding the null terminator, is returned in <length>.
If <length> is NULL, then no length is returned. The maximum number of
characters that may be written into <counterString>, including the null
terminator, is specified by <bufSize>. If <bufSize> is 0 and
<counterString> is NULL, the number of characters that would be required to
hold the counter string, excluding the null terminator, is returned in
<length>. If <group> does not reference a valid group ID, or <counter>
does not reference a valid counter within the group ID, an INVALID_VALUE
error is generated.
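The two string queries above support a size-then-fetch pattern: ask for the length with a zero-sized buffer, allocate, then fetch. The following is a minimal sketch of that pattern for group names. So that it builds without GL headers, it takes the query as a callback written with plain C types. GroupStringFn, queryGroupName, and fakeGetGroupString are hypothetical names, and fakeGetGroupString is only a stand-in that mimics the length rules described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Callback with the same parameter order as GetPerfMonitorGroupStringAMD. */
typedef void (*GroupStringFn)(unsigned int group, int bufSize,
                              int *length, char *groupString);

/* Returns a malloc'ed, null-terminated copy of the group name, or NULL on
 * failure. The caller frees the result. */
char *
queryGroupName(GroupStringFn getString, unsigned int group)
{
    int length = 0;
    char *name;

    assert(getString != NULL);

    /* First call: bufSize 0 and a NULL buffer request only the length,
     * which excludes the null terminator. */
    getString(group, 0, &length, NULL);
    if (length < 0)
        return NULL;

    name = (char *)malloc((size_t)length + 1);
    if (name == NULL)
        return NULL;
    name[0] = '\0';     /* stays empty if the query writes nothing */

    /* Second call: bufSize includes room for the null terminator. */
    getString(group, length + 1, NULL, name);
    return name;
}

/* Stand-in for the GL entry point, for demonstration only: every group is
 * named "HW". */
void
fakeGetGroupString(unsigned int group, int bufSize, int *length,
                   char *groupString)
{
    static const char name[] = "HW";
    int full = (int)strlen(name);
    int written = 0;

    (void)group;
    if (bufSize > 0 && groupString != NULL)
    {
        written = (full < bufSize - 1) ? full : bufSize - 1;
        memcpy(groupString, name, (size_t)written);
        groupString[written] = '\0';
    }
    if (length != NULL)
        *length = (bufSize == 0 && groupString == NULL) ? full : written;
}
```

In a real application the callback would be a thin wrapper around glGetPerfMonitorGroupStringAMD, and the same shape, with an extra counter argument, would apply to GetPerfMonitorCounterStringAMD.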
The command
void GetPerfMonitorCounterInfoAMD(uint group, uint counter,
enum pname, void *data);
returns the following information about a counter. For a <counter>
belonging to <group>, we can query the counter type and counter range. If
<pname> is COUNTER_TYPE_AMD, then <data> returns the type. Valid type
values returned are UNSIGNED_INT, UNSIGNED_INT64_AMD, PERCENTAGE_AMD, FLOAT.
If type value returned is PERCENTAGE_AMD, then this describes a float
value that is in the range [0.0 .. 100.0]. If <pname> is COUNTER_RANGE_AMD,
<data> returns two values representing a minimum and a maximum. The
counter's type is used to determine the format in which the range values
are returned. If <group> does not reference a valid group ID, or <counter>
does not reference a valid counter within the group ID, an INVALID_VALUE
error is generated.
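Since the range is returned through a void pointer, the application has to interpret it according to the counter type. The sketch below is one way to do that. It assumes the minimum and maximum are stored back to back in the counter's native type, that PERCENTAGE_AMD ranges are floats, and that uint64_t from <stdint.h> is the application's 64-bit type. decodeCounterRange is a hypothetical helper, and the token values are copied from the New Tokens section so the sketch builds without GL headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Token values from the New Tokens section; a real build would take them
 * from the GL headers instead. */
#define GL_UNSIGNED_INT        0x1405
#define GL_FLOAT               0x1406
#define GL_UNSIGNED_INT64_AMD  0x8BC2
#define GL_PERCENTAGE_AMD      0x8BC3

/* Converts the two values returned for COUNTER_RANGE_AMD into doubles.
 * Returns 0 on success, or -1 if <type> is not a known counter type. */
int
decodeCounterRange(unsigned int type, const void *data,
                   double *minOut, double *maxOut)
{
    assert(data != NULL && minOut != NULL && maxOut != NULL);

    switch (type)
    {
    case GL_UNSIGNED_INT:
    {
        uint32_t v[2];
        memcpy(v, data, sizeof(v));
        *minOut = (double)v[0];
        *maxOut = (double)v[1];
        return 0;
    }
    case GL_UNSIGNED_INT64_AMD:
    {
        uint64_t v[2];
        memcpy(v, data, sizeof(v));
        *minOut = (double)v[0];     /* may lose precision above 2^53 */
        *maxOut = (double)v[1];
        return 0;
    }
    case GL_FLOAT:
    case GL_PERCENTAGE_AMD:
    {
        float v[2];
        memcpy(v, data, sizeof(v));
        *minOut = v[0];
        *maxOut = v[1];
        return 0;
    }
    default:
        return -1;
    }
}
```

The buffer passed as <data> to GetPerfMonitorCounterInfoAMD should be large enough for two 64-bit values, so that it can hold the range of any counter type.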
The command
void GenPerfMonitorsAMD(sizei n, uint *monitors)
returns a list of monitors. These monitors can then be used to select
groups/counters to be sampled, to start multiple monitoring sessions and to
return counter information sampled by the GPU. At creation time, the
performance monitor object has all counters disabled. The value of the
PERFMON_RESULT_AVAILABLE_AMD, PERFMON_RESULT_AMD, and
PERFMON_RESULT_SIZE_AMD queries will all initially be 0.
The command
void DeletePerfMonitorsAMD(sizei n, uint *monitors)
is used to delete the list of monitors created by a previous call to
GenPerfMonitorsAMD. If a monitor ID in the list <monitors> does not
reference a previously generated performance monitor, an INVALID_VALUE
error is generated.
The command
void SelectPerfMonitorCountersAMD(uint monitor, boolean enable,
uint group, int numCounters,
uint *counterList);
is used to enable or disable a list of counters from a group to be monitored
as identified by <monitor>. The <enable> argument determines whether the
counters should be enabled or disabled. <group> specifies the group
ID under which counters will be enabled or disabled. The <numCounters>
argument gives the number of counters to be selected from the list
<counterList>. If <monitor> is not a valid monitor created by
GenPerfMonitorsAMD, then an INVALID_VALUE error will be generated. If <group>
is not a valid group, an INVALID_VALUE error will be generated. If
<numCounters> is less than 0, an INVALID_VALUE error will be generated.
When SelectPerfMonitorCountersAMD is called on a monitor, any outstanding
results for that monitor become invalidated and the result queries
PERFMON_RESULT_SIZE_AMD and PERFMON_RESULT_AVAILABLE_AMD are reset to 0.
The command
void BeginPerfMonitorAMD(uint monitor);
is used to start a monitor session. Note that BeginPerfMonitorAMD calls cannot
be nested. In addition, given the groups and counters enabled for a monitor,
the implementation may be unable to sample all of them in a single session,
in which case the monitor session will fail and an INVALID_OPERATION error
will be generated.
While BeginPerfMonitorAMD does mark the beginning of performance counter
collection, the counters do not begin collecting immediately. Rather, the
counters begin collection when BeginPerfMonitorAMD is processed by
the hardware. That is, the API is asynchronous, and performance counter
collection does not begin until the graphics hardware processes the
BeginPerfMonitorAMD command.
The command
void EndPerfMonitorAMD(uint monitor);
ends a monitor session started by BeginPerfMonitorAMD. If a performance
monitor is not currently started, an INVALID_OPERATION error will be
generated.
Note that there is an implied overhead to collecting performance counters
that may or may not distort performance depending on the implementation.
For example, some counters may require a pipeline flush thereby causing a
change in the performance of the application. Further, the frequency at
which an application samples may distort the accuracy of counters which are
variant (e.g., non-deterministic based on the input). While the effects
of sampling frequency are implementation dependent, general guidance can
be given that sampling at a high frequency may distort both performance
of the application and the accuracy of variant counters.
The command
void GetPerfMonitorCounterDataAMD(uint monitor, enum pname,
sizei dataSize,
uint *data, int *bytesWritten);
is used to return counter values that have been sampled for a monitor
session. If <pname> is PERFMON_RESULT_AVAILABLE_AMD, then <data> will
indicate whether the result is available or not. If <pname> is
PERFMON_RESULT_SIZE_AMD, <data> will contain the actual size, in bytes, of
all counter results being sampled. If <pname> is PERFMON_RESULT_AMD, <data>
will contain the results. For each counter of a group that was selected to be
sampled, the information is returned as group ID, followed by counter ID,
followed by counter value. The size of counter value returned will depend
on the counter value type. The argument <dataSize> specifies the number of
bytes available in the <data> buffer for writing. If <bytesWritten> is not
NULL, it gives the number of bytes written into the <data> buffer. It is an
INVALID_OPERATION error for <data> to be NULL. If <pname> is
PERFMON_RESULT_AMD and <dataSize> is less than the number of bytes required
to store the results as reported by a PERFMON_RESULT_SIZE_AMD query, then
results will be written only up to the number of bytes specified by
<dataSize>.
If no BeginPerfMonitorAMD/EndPerfMonitorAMD has been issued for a monitor,
then the result of querying for PERFMON_RESULT_AVAILABLE_AMD and
PERFMON_RESULT_SIZE_AMD will be 0. When SelectPerfMonitorCountersAMD is
called on a monitor, the results stored for the monitor become invalidated
and the PERFMON_RESULT_AVAILABLE_AMD and PERFMON_RESULT_SIZE_AMD queries
behave as if no BeginPerfMonitorAMD/EndPerfMonitorAMD has been issued for
the monitor.
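The result layout described for PERFMON_RESULT_AMD (group ID, counter ID, counter value) can be walked as a sequence of 32-bit words. The sketch below is one possible decoder. It assumes 64-bit values occupy two words and every other type occupies one, as in the Sample Usage section. CounterTypeFn, CounterResult, parseCounterData, and demoTypeOf are hypothetical names, and demoTypeOf stands in for a lookup that a real application would implement with GetPerfMonitorCounterInfoAMD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Token values from the New Tokens section. */
#define GL_UNSIGNED_INT        0x1405
#define GL_FLOAT               0x1406
#define GL_UNSIGNED_INT64_AMD  0x8BC2
#define GL_PERCENTAGE_AMD      0x8BC3

/* Returns the COUNTER_TYPE_AMD value for (group, counter). */
typedef uint32_t (*CounterTypeFn)(uint32_t group, uint32_t counter);

/* One decoded record; the value is widened to double for convenience. */
typedef struct
{
    uint32_t group;
    uint32_t counter;
    uint32_t type;
    double   value;
} CounterResult;

/* Decodes up to <maxResults> records from <bytesWritten> bytes of result
 * data. Stops at an unknown type or a truncated record and returns the
 * number of records decoded. */
size_t
parseCounterData(const uint32_t *data, size_t bytesWritten,
                 CounterTypeFn typeOf, CounterResult *results,
                 size_t maxResults)
{
    size_t numWords = bytesWritten / sizeof(uint32_t);
    size_t word = 0;
    size_t count = 0;

    assert(data != NULL && typeOf != NULL && results != NULL);

    while (count < maxResults && word + 2 < numWords)
    {
        CounterResult r;
        size_t valueWords;

        r.group   = data[word];
        r.counter = data[word + 1];
        r.type    = typeOf(r.group, r.counter);

        valueWords = (r.type == GL_UNSIGNED_INT64_AMD) ? 2 : 1;
        if (word + 2 + valueWords > numWords)
            break;                      /* truncated record */

        if (r.type == GL_UNSIGNED_INT64_AMD)
        {
            uint64_t v;
            memcpy(&v, &data[word + 2], sizeof(v));
            r.value = (double)v;
        }
        else if (r.type == GL_FLOAT || r.type == GL_PERCENTAGE_AMD)
        {
            float v;
            memcpy(&v, &data[word + 2], sizeof(v));
            r.value = v;
        }
        else if (r.type == GL_UNSIGNED_INT)
        {
            r.value = (double)data[word + 2];
        }
        else
        {
            break;                      /* unknown type: value size unknown */
        }

        results[count++] = r;
        word += 2 + valueWords;
    }
    return count;
}

/* Demonstration-only lookup: counter 0 is 64-bit, counter 1 is a float,
 * and everything else is a 32-bit unsigned integer. */
uint32_t
demoTypeOf(uint32_t group, uint32_t counter)
{
    (void)group;
    if (counter == 0)
        return GL_UNSIGNED_INT64_AMD;
    return (counter == 1) ? GL_FLOAT : GL_UNSIGNED_INT;
}
```

Copying values out with memcpy avoids relying on the alignment of 64-bit values inside the result buffer, which the text above does not specify.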
Errors
INVALID_OPERATION error will be generated if BeginPerfMonitorAMD is unable
to begin monitoring with the currently selected counters.
INVALID_OPERATION error will be generated if BeginPerfMonitorAMD is called
when a performance monitor is already active.
INVALID_OPERATION error will be generated if EndPerfMonitorAMD is called
when a performance monitor is not currently started.
INVALID_VALUE error will be generated if the <group> parameter to
GetPerfMonitorCountersAMD, GetPerfMonitorGroupStringAMD,
GetPerfMonitorCounterStringAMD, GetPerfMonitorCounterInfoAMD, or
SelectPerfMonitorCountersAMD does not reference a valid group ID.
INVALID_VALUE error will be generated if the <counter> parameter to
GetPerfMonitorCounterStringAMD or GetPerfMonitorCounterInfoAMD does not
reference a valid counter ID in the group specified by <group>.
INVALID_VALUE error will be generated if any of the monitor IDs
in the <monitors> parameter to DeletePerfMonitorsAMD do not reference
a valid generated monitor ID.
INVALID_VALUE error will be generated if the <monitor> parameter to
SelectPerfMonitorCountersAMD does not reference a monitor created by
GenPerfMonitorsAMD.
INVALID_VALUE error will be generated if the <numCounters> parameter to
SelectPerfMonitorCountersAMD is less than 0.
New State
Sample Usage
// The GL or GL ES headers that declare this extension are also required.
#include <stdlib.h>
#include <string.h>

typedef struct
{
    GLuint *counterList;
    int     numCounters;
    int     maxActiveCounters;
} CounterInfo;

void
getGroupAndCounterList(GLuint **groupsList, int *numGroups,
                       CounterInfo **counterInfo)
{
    GLint        n;
    GLuint      *groups;
    CounterInfo *counters;

    // query the number of groups, then fetch the group IDs
    glGetPerfMonitorGroupsAMD(&n, 0, NULL);
    groups = (GLuint*) malloc(n * sizeof(GLuint));
    glGetPerfMonitorGroupsAMD(NULL, n, groups);
    *numGroups = n;

    *groupsList = groups;
    counters = (CounterInfo*) malloc(sizeof(CounterInfo) * n);
    for (int i = 0 ; i < n; i++ )
    {
        // query the counts for this group, then fetch the counter IDs
        glGetPerfMonitorCountersAMD(groups[i], &counters[i].numCounters,
                                    &counters[i].maxActiveCounters, 0, NULL);

        counters[i].counterList = (GLuint*)malloc(counters[i].numCounters *
                                                  sizeof(GLuint));

        glGetPerfMonitorCountersAMD(groups[i], NULL, NULL,
                                    counters[i].numCounters,
                                    counters[i].counterList);
    }

    *counterInfo = counters;
}
// The group and counter lists are queried once and cached across calls,
// so they must have static storage.
static int          countersInitialized = 0;
static int          numGroups;
static GLuint      *groups;
static CounterInfo *counters;

int
getCounterByName(const char *groupName, const char *counterName,
                 GLuint *groupID, GLuint *counterID)
{
    int i = 0;

    if (!countersInitialized)
    {
        getGroupAndCounterList(&groups, &numGroups, &counters);
        countersInitialized = 1;
    }

    for ( i = 0; i < numGroups; i++ )
    {
        char curGroupName[256];
        glGetPerfMonitorGroupStringAMD(groups[i], 256, NULL, curGroupName);
        if (strcmp(groupName, curGroupName) == 0)
        {
            *groupID = groups[i];
            break;
        }
    }

    if ( i == numGroups )
        return -1;           // error - could not find the group name

    for ( int j = 0; j < counters[i].numCounters; j++ )
    {
        char curCounterName[256];

        glGetPerfMonitorCounterStringAMD(groups[i],
                                         counters[i].counterList[j],
                                         256, NULL, curCounterName);
        if (strcmp(counterName, curCounterName) == 0)
        {
            *counterID = counters[i].counterList[j];
            return 0;
        }
    }

    return -1;           // error - could not find the counter name
}
void
drawFrameWithCounters(void)
{
    GLuint group[2];
    GLuint counter[2];
    GLuint monitor;
    GLuint *counterData;

    // Get group/counter IDs by name.  Note that normally the
    // counter and group names need to be queried for because
    // each implementation of this extension on different hardware
    // could define different names and groups.  This is just provided
    // to demonstrate the API.
    getCounterByName("HW", "Hardware Busy", &group[0],
                     &counter[0]);
    getCounterByName("API", "Draw Calls", &group[1],
                     &counter[1]);

    // create perf monitor ID
    glGenPerfMonitorsAMD(1, &monitor);

    // enable the counters
    glSelectPerfMonitorCountersAMD(monitor, GL_TRUE, group[0], 1,
                                   &counter[0]);
    glSelectPerfMonitorCountersAMD(monitor, GL_TRUE, group[1], 1,
                                   &counter[1]);

    glBeginPerfMonitorAMD(monitor);

    // RENDER FRAME HERE
    // ...

    glEndPerfMonitorAMD(monitor);

    // read the counters.  Collection is asynchronous, so an application
    // can query GL_PERFMON_RESULT_AVAILABLE_AMD first to check that the
    // results are ready.
    GLuint resultSize = 0;
    glGetPerfMonitorCounterDataAMD(monitor, GL_PERFMON_RESULT_SIZE_AMD,
                                   sizeof(GLuint), &resultSize, NULL);

    counterData = (GLuint*) malloc(resultSize);

    GLint bytesWritten = 0;
    glGetPerfMonitorCounterDataAMD(monitor, GL_PERFMON_RESULT_AMD,
                                   resultSize, counterData, &bytesWritten);

    // display or log counter info
    GLint wordCount = 0;

    while ( (4 * wordCount) < bytesWritten )
    {
        GLuint groupId = counterData[wordCount];
        GLuint counterId = counterData[wordCount + 1];

        // Determine the counter type
        GLuint counterType;
        glGetPerfMonitorCounterInfoAMD(groupId, counterId,
                                       GL_COUNTER_TYPE_AMD, &counterType);

        if ( counterType == GL_UNSIGNED_INT64_AMD )
        {
            // assumes unsigned long long is 64 bits wide; memcpy avoids
            // an unaligned 64-bit read from the word buffer
            unsigned long long counterResult;
            memcpy(&counterResult, &counterData[wordCount + 2],
                   sizeof(counterResult));

            // Print counter result

            wordCount += 4;
        }
        else if ( counterType == GL_FLOAT ||
                  counterType == GL_PERCENTAGE_AMD )
        {
            float counterResult = *(float*)(&counterData[wordCount + 2]);

            // Print counter result

            wordCount += 3;
        }
        else if ( counterType == GL_UNSIGNED_INT )
        {
            GLuint counterResult = counterData[wordCount + 2];

            // Print counter result

            wordCount += 3;
        }
        else
        {
            break;  // unknown counter type, value size cannot be determined
        }
    }

    free(counterData);
    glDeletePerfMonitorsAMD(1, &monitor);
}
Revision History
11/29/2007 - dginsburg
+ Clarified the default state of a performance monitor object on creation
11/09/2007 - dginsbur
+ Clarify what happens if SelectPerfMonitorCountersAMD is called on
a monitor with outstanding query results.
+ Rename counterSize to countersSize
+ Remove some ';' typos
06/13/2007 - dginsbur
+ Add language on the asynchronous nature of the API and
counter accuracy/performance distortion.
+ Add myself as the contact
+ Remove INVALID_OPERATION error when countersList is NULL
+ Clarify 64-bit issue
+ Make PERCENTAGE_AMD counters float rather than uint
+ Clarify accuracy distortion on variant counters only
+ Tweak to overview language
06/09/2007 - dginsbur
+ Fill in errors section and make many more errors explicit
+ Fix the example code so it compiles
06/08/2007 - dginsbur
+ Modified GetPerfMonitorGroupString and GetPerfMonitorCounterString to
be more client/server friendly.
+ Modified example.
+ Renamed parameters/variables to follow GL conventions.
+ Modified several 'int' param types to 'sizei'
+ Modified counters type from 'int' to 'uint'
+ Renamed argument 'cb' and 'cbret'
+ Better documented GetPerfMonitorCounterData
+ Add AMD adornment in many places that were missing it
06/07/2007 - dginsbur
+ Cleanup formatting, remove tabs, make fit in proper page width
+ Add FLOAT and UNSIGNED_INT to list of COUNTER_TYPEs
+ Fix some bugs in the example code
+ Rewrite introduction
+ Clarified Issue 1 reasoning
+ Added Issue 3 regarding use of 64-bit data types
+ Added revision history
03/21/2007 - Initial version written. Written by amunshi.