Mark Kilgard, Principal System Software Engineer
OpenGL for 2015
Barthold Lichtenbelt presented an abbreviated version of these slides at Khronos BOF at SIGGRAPH on Wednesday, August 12, 2015 in Los Angeles

Published in: Technology

  1. Mark Kilgard, Principal System Software Engineer, OpenGL for 2015
  2. Page 2 Thirteen new standard OpenGL extensions for 2015 •New ARB extensions - New shader, texture, and graphics pipeline functionality - Proven standard technology - Mostly existed previously as vendor extensions - Now officially standardized by Khronos - Ensures OpenGL is a proper super-set of ES 3.2 •Not a new core standard update but - Eighth consecutive year of Khronos updates to OpenGL at SIGGRAPH - Also did Vulkan this year  - Core version remains OpenGL 4.5
  3. Page 3 Khronos 2015 Announcement for OpenGL • August 10, 2015 - At SIGGRAPH • “A set of OpenGL extensions will … expose the very latest capabilities of desktop hardware.”
  4. Page 4 Same Day: NVIDIA has driver with full support • August 10, 2015 - Tradition that NVIDIA releases “zero day” driver with full functionality at Khronos announcement - Done for past several OpenGL releases • Ready today for developers to begin coding against latest standard extensions - Technically a “beta” driver but fully functional - Intended for developers - Official support for end-user drivers coming soon
  5. Page 5 Broad Categories of New OpenGL Functionality •NEW graphics pipeline operation •NEW texture mapping functionality •NEW shader functionality
  6. Page 6 NEW Graphics Pipeline Operation • Fragment shader interlock - ARB_fragment_shader_interlock • Programmable sample positions for rasterization - ARB_sample_locations • Post-depth coverage version of sample mask - ARB_post_depth_coverage • Vertex shader viewport & layer output - ARB_shader_viewport_layer_array • Tessellation bounding box - ARB_ES3_2_compatibility Details…
  7. Page 7 Fragment Shader Interlock •NEW extension: ARB_fragment_shader_interlock - Provides reliable means to read/write fragment’s pixel state within a fragment shader - GPU managed, no explicit barriers needed •Uses - Custom blend modes - Deferred shading algorithms - E.g. screen space decals •Adds GLSL functions to begin/end interlock - void beginInvocationInterlockARB(void); - void endInvocationInterlockARB(void); •Why is a fragment shader interlock needed? ... Image credit: David Bookout (Intel), Programmable Blend with Pixel Shader Ordering Shared exponent (rgb9e5) format blending via fragment shader interlock
  8. Page 8 Pixel Update Preserves Primitive Rasterization Order Same Pixel, covered by 3 overlapping primitives OpenGL requires stencil/depth/blend operations be observed to match rendering order, so: Primitive rasterization order: rasterized primitive #1, rasterized primitive #2, rasterized primitive #3
  9. Page 9 Yet Fragment Shading is Massively Parallel + 1000’s of other fragments + scores of other primitives GPU Fragment Shading: parallel execution of fragment shader threads Conventional Approach: Batch as many fragments in parallel as possible, maximum efficiency (batch executing in parallel)
  10. Page 10 Post-Shader Pixel Updates Respect Rasterization Order + 1000’s of other fragments Fragment Shading: parallel execution of fragment shader threads 1st blend 2nd blend 3rd blend Shader results feed fixed-function Pixel Update (stencil test, depth test, & blend)
  11. Page 11 However, Shader Access to Framebuffer Unsafe! + 1000’s of other fragments GPU Fragment Shading: parallel execution of fragment shader threads Pixel updates by fragment shader instances executing in parallel (imageLoad, imageStore) cannot guarantee primitive rasterization order! Exact behavior varies by GPU and is timing-dependent on any particular GPU, so it is both undefined & unreliable
  12. Page 12 Interlock Guarantees Pixel Ordering of Shading GPU Fragment Shading: parallel execution of fragment shader threads + scores of other primitives Interlock Approach: Batch, but disallow fragments for the same pixel from executing the fragment shader interlock in parallel (batch #1, batch #2, batch #3)
  13. Page 13 Fragment Shader Interlock Example • We want to draw a grid of Stanford bunnies… …stamped with a few brick normal maps … and then bump-map shaded Image credit: Jiho Choi (NVIDIA), GameWorks NormalBlendedDecal example
  14. Page 14 Motivation: Bullet holes and dynamic scuffs • Desire: Dynamically add apparently geometric details as “after effects” Figure: normal map and shaded color result, shown without screen-space decals and with screen-space decals Image credit: Pope Kim, Screen Space Decals in Warhammer 40,000: Space Marine
  15. Page 15 Screen Space Decal Approach • Draw scene to G-buffer - Renders world-space normals to “normal image” framebuffer • Draw screen-space box for each screen space decal - If pixel’s world-space position in G-buffer isn’t in box, discard fragment - Avoids drawing decal on incorrect surface (one too close or too far) - Fetch decal’s tangent-space normal from decal’s normal map - Within fragment shader interlock - Fetch pixel’s world-space normal from “normal image” framebuffer - Rotate decal normal to world space - Using tangent basis constructed from world-space normal - Then blend (and renormalize) decal normal with pixel’s normal - Replace pixel’s world-space normal in “normal image” with blended normal • Do deferred shading on G-buffer, using “normal image” perturbed by decals
  16. Page 16 Screen Space Decal Approach Visualized Visualization of decal boxes overlaid on scene “Normal image” after blended normal decals “Normal image” before blended normal decals Brick pattern normal map decals applied to decal boxes Final shaded color result Bunny shading includes brick pattern brick normals blended with fragment shader interlock
  17. Page 17 GLSL Fragment Interlock Usage • Fragment interlock portion of screen space decal GLSL fragment shader
        beginInvocationInterlockARB();
        {
          // Read "normal image" framebuffer's world space normal
          vec3 destNormalWS =
            normalize(imageLoad(uNormalImage, ivec2(gl_FragCoord.xy)).xyz);
          // Read decal's tangent space normal
          vec3 decalNormalTS =
            normalize(textureLod(uDecalNormalTex, uv, 0.0).xyz * 2 - 1);
          // Rotate decal's normal from tangent space to world space
          vec3 tangentWS = vec3(1, 0, 0);
          vec3 newNormalWS =
            normalize(mat3x3(tangentWS,
                             cross(destNormalWS, tangentWS),
                             destNormalWS) * decalNormalTS);
          // Blend world space normal vectors
          vec3 destNewNormalWS =
            normalize(mix(newNormalWS, destNormalWS, uBlendWeight));
          // Write new blended normal into "normal image" framebuffer
          imageStore(uNormalImage, ivec2(gl_FragCoord.xy),
                     vec4(destNewNormalWS, 0));
        }
        endInvocationInterlockARB();
  18. Page 18 Blend Equation Advanced vs. Shader Interlock Shader Interlock (2015) • Advantages - Arbitrary shading operations allowed - Very powerful & general - No explicit barrier needed • Disadvantages - Requires putting color blending in every fragment shader - Lengthens shader - Not orthogonal to multisampling - Fragment shader responsible for reading/writing every color sample - Unavailable for legacy fixed-function - Needs latest GPU generation Blend Equation Advanced (2014) • Advantages - Support for established blend modes - Same as Photoshop, PDF, Flash, SVG - Optimized for their numeric precision requirements - Orthogonal to fragment shading - Just like conventional blending - Just works with multisampling & sRGB - Works with fixed-function rendering in compatibility context - Same “KHR” extension for OpenGL ES - Available on older hardware - But needs glBlendBarrierKHR • Disadvantages - Blend modes limited to a pre-defined set - Limited to 1 color attachment Similar, but different functionality Each extension makes sense in its intended context
  19. Page 19 Programmable Sample Positions • Conventional OpenGL - Multisample rasterization has fixed sample positions • NEW ARB_sample_locations extension - glFramebufferSampleLocationsfvARB specifies sample positions on sub-pixel grid Default 8x multisample pattern Application-specified 8x multisample pattern, oriented for horizontal sampling Same triangle but covers sample patterns differently
  20. Page 20 Application: Temporal Antialiasing • Reprogram sample positions differently every frame and render continuously • Done well, can double effective antialiasing quality “for free” - Needs vertical refresh synchronization - And app must render at rate matching refresh rate (e.g. 60 Hz) Figure: default 2x multisample pattern, alternative 2x multisample pattern, and the resulting temporal virtual 4x antialiasing
  21. Page 21 Post Depth Coverage • Normally in OpenGL, stencil and depth tests are specified to run after fragment shader execution - Allows shader to discard fragments prior to these tests - So avoids the depth and stencil buffer update side-effects of these tests • OpenGL 4.2 added the ability for a fragment shader to force the stencil and depth tests to run before the fragment shader - Part of ARB_shader_image_load_store extension - Indicated in GLSL fragment shader by layout(early_fragment_tests) in; • NEW extension ARB_post_depth_coverage - Controls whether the fragment shader sample mask gl_SampleMaskIn[] reflects the coverage before or after application of the early depth and stencil tests - Allows shader to know what samples survived stencil & depth tests - What you really want if you are using early fragment tests + sample mask - Indicated in GLSL fragment shader by layout(post_depth_coverage) in;
  22. Page 22 Early Fragment Tests & Post Depth Coverage Pipeline stages shown in each diagram: rasterizer, fragment shader, stencil test, depth test, color blending, with gl_SampleMaskIn feeding the fragment shader • Default behavior: late stencil-depth tests; rasterizer determines sample mask • With layout(early_fragment_tests) in; early stencil-depth tests; rasterizer determines sample mask • With layout(early_fragment_tests) in; and layout(post_depth_coverage) in; early stencil-depth tests; post-depth coverage determines sample mask
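The three configurations on the slide above can be summarized with a small CPU-side model. This is only an illustrative sketch, not OpenGL code: `model_sample_mask_in`, `count_samples`, and their parameters are hypothetical names, and the model assumes, as the slide shows, that `post_depth_coverage` is used together with `early_fragment_tests`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPU-side model only; none of these names are OpenGL API.
 * raster_coverage: bit i set if the primitive covers sample i.
 * tests_passed:    bit i set if sample i passes the stencil and depth tests.
 * Returns the mask a shader would be expected to see in gl_SampleMaskIn[0]. */
uint32_t model_sample_mask_in(uint32_t raster_coverage,
                              uint32_t tests_passed,
                              int sample_count,
                              bool early_fragment_tests,
                              bool post_depth_coverage)
{
    assert(sample_count >= 1 && sample_count <= 32);
    uint32_t valid = (sample_count == 32) ? 0xFFFFFFFFu
                                          : ((1u << sample_count) - 1u);
    uint32_t mask = raster_coverage & valid;

    /* Only when both layout qualifiers are present does the mask
     * reflect which samples survived the early tests. */
    if (early_fragment_tests && post_depth_coverage)
        mask &= tests_passed;
    return mask;
}

/* Number of samples set in a mask. */
int count_samples(uint32_t mask)
{
    int n = 0;
    while (mask) {
        mask &= mask - 1u;  /* clear the lowest set bit */
        ++n;
    }
    return n;
}
```

For a 4x multisample pixel that is fully covered (`0xF`) but where only samples 0 and 2 pass the depth test (`0x5`), the model returns `0xF` by default and with early tests alone, and `0x5` once post-depth coverage is also requested.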
  23. Page 23 Vertex Shader Viewport & Layer Output • NEW extension ARB_shader_viewport_layer_array • Previously geometry shader needed to write viewport index and layer - Forced layered rendering to use geometry shaders - Even if a geometry shader wasn’t otherwise needed • New vertex shader (or tessellation evaluation shader) outputs - out int gl_ViewportIndex - out int gl_Layer
  24. Page 24 ES 3.2 Compatibility (tessellation, queries) • NEW extension ARB_ES3_2_compatibility • Command to specify bounding box for evaluated tessellated vertices in Normalized Device Coordinate (NDC) space - glPrimitiveBoundingBox(float minX, float minY, float minZ, float maxX, float maxY, float maxZ) - Initial state accepts the entirety of NDC space (effectively not limiting tessellation) - Implementations may be able to optimize performance, assuming accurate bounds - ES 3.2 added this to make tessellation more friendly to mobile use cases - Hint: Expect today’s desktop GPUs are likely to simply ignore this but API matches ES 3.2 • Bonus: - OpenGL ES 3.2 adds two implementation-dependent constants related to multisample line rasterization - GL_MULTISAMPLE_LINE_WIDTH_RANGE_ARB - GL_MULTISAMPLE_LINE_WIDTH_GRANULARITY_ARB - Same token values as ES 3.2 - These queries supported for completeness (yawn)
  25. Page 25 NEW Texture Mapping Functionality • Texture Reduction Modes: Min/Max - ARB_texture_filter_minmax • Sparse Textures, done right - ARB_sparse_texture2 • Sparse Texture Clamping - ARB_sparse_texture_clamp Details…
  26. Page 26 New Texture Reduction Modes: Min/Max •Standard texturing behavior - Texture fetch result = weighted average of sampled texel values - What you want for color images, etc. •NEW extension: ARB_texture_filter_minmax - Texture fetch result = minimum or maximum of all sampled texel values •Adds NEW “reduction mode” for texture parameter - Choices: GL_WEIGHTED_AVERAGE_ARB (initial state), GL_MIN, or GL_MAX - Use with glTexParameteri, glSamplerParameteri, etc. •Example applications - Estimating variance or range when sampling data in textures - Conservative texture sampling - E.g. Maximum Intensity Projection for medical imaging
  27. Page 27 Application: Maximum Intensity Projection • Radiologists interpret 3D visualizations of CT scans • Volume rendering simulates opacity-attenuated ray casting - Good for visualizing 3D structure • Maximum Intensity Projection (MIP) rendering shows maximum intensity along any ray - Good for highlighting features without regard to occlusion - Avoids missing significant features Figure: volume rendering (texture reduction mode GL_WEIGHTED_AVERAGE_ARB) and Maximum Intensity Projection (texture reduction mode GL_MAX) Image credit: Fishman et al. Volume Rendering versus Maximum Intensity Projection in CT Angiography: What Works Best, When, and Why
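To make the difference between the three reduction modes concrete, here is a small CPU-side sketch of how one filter footprint could be reduced. It is not OpenGL code: the `reduction_mode` enum and `reduce_footprint` are hypothetical names, and the sketch simply ignores the filter weights in the min and max modes, which may differ in detail from what a given implementation does for zero-weight texels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GL_WEIGHTED_AVERAGE_ARB, GL_MIN, and GL_MAX. */
typedef enum {
    REDUCE_WEIGHTED_AVERAGE,
    REDUCE_MIN,
    REDUCE_MAX
} reduction_mode;

/* Reduce the texels of one filter footprint (for example the 2x2 texels
 * touched by a bilinear fetch) to a single value. */
float reduce_footprint(const float *texels, const float *weights,
                       size_t count, reduction_mode mode)
{
    assert(count > 0);
    float result = (mode == REDUCE_WEIGHTED_AVERAGE) ? 0.0f : texels[0];
    for (size_t i = 0; i < count; ++i) {
        switch (mode) {
        case REDUCE_WEIGHTED_AVERAGE:
            result += weights[i] * texels[i];   /* conventional filtering */
            break;
        case REDUCE_MIN:
            if (texels[i] < result) result = texels[i];
            break;
        case REDUCE_MAX:
            if (texels[i] > result) result = texels[i];
            break;
        }
    }
    return result;
}
```

With four texels `{0, 1, 2, 5}` and equal weights of 0.25, the weighted average is 2, the minimum is 0, and the maximum is 5. The maximum is the value a Maximum Intensity Projection style lookup would want.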
  28. Page 28 Maximum Intensity Projection vs. Volume Rendering Visualized Axial view of human middle torso Volume Rendering Maximum Intensity Projection Good at mapping arterial structure, despite occlusion Provides more 3D feel by accounting for occlusion Image credit: Fishman et al. Volume Rendering versus Maximum Intensity Projection in CT Angiography: What Works Best, When, and Why
  29. Page 29 Sparse Textures Visualized • Textures can be HUGE - Think of satellite data - Or all the terrain in a huge game level - Or medical or seismic imaging • We don’t ever expect to be looking at everything at once! - When textures are huge, can we just make resident what we need? - YES, that’s sparse texture • ARB_sparse_texture standardized in 2013 - Reflected limitations of original sparse texture hardware implementations - Now we can do better… Figure: mipmap chain of a sparse texture; only a limited number of pages are resident Image credit: AMD
  30. Page 30 Sparse Textures, done right • NEW extension ARB_sparse_texture2 - Builds on prior ARB_sparse_texture (2013) extension - Original concept: intended for enormous textures, allows less than the complete set of “pages” of the texture image set to be resident - Primary limitation: - Fetching non-resident data returned undefined results without indication - So no way to know if non-resident data was fetched - This reflected hardware limitations of the time, fixed in newer hardware • Sparse Texture version 2 is friendly to dynamically detecting non-resident access - Fetch of non-resident data now reliably returns zero values - sparseTextureARB GLSL texture fetch functions return residency information integer - And 11 other variations of sparseTexture*ARB GLSL functions as well - sparseTexelsResidentARB GLSL function maps returned integer as Boolean residency - Now supports sparse multisample and multisample texture arrays
  31. Page 31 Sparse Texture, done even better • NEW extension ARB_sparse_texture_clamp • Adds new GLSL texture fetch variant functions - Include an additional level-of-detail (LOD) parameter to provide a per-fetch floor on the hardware-computed LOD - I.e. the minimum lodClamp parameter - Sparse texture variants - sparseTextureClampARB, sparseTextureOffsetClampARB, sparseTextureGradClampARB, sparseTextureGradOffsetClampARB - Non-sparse texture versions too - textureClampARB, textureOffsetClampARB, textureGradClampARB, textureGradOffsetClampARB • Benefit for sparse texture fetches - Shaders can avoid accessing unpopulated portions of high-resolution levels of detail when knowing texture detail is unpopulated - Either from a priori knowledge - Or feedback from previously executed "sparse" texture lookup functions
  32. Page 32 Sparse Texture Clamp Example • Naively fetch sparse texture until you get a valid texel
        vec4 texel;
        int code = sparseTextureARB(sparseTexture, uv, texel);
        float minLodClamp = 1;
        while (!sparseTexelsResidentARB(code)) {
          code = sparseTextureClampARB(sparseTexture, uv, texel, minLodClamp);
          minLodClamp += 1.0f;
        }
      Figure cases: 1 fetch; 2 fetches, 1 missed; 3 fetches, 2 missed
  33. Page 33 NEW Shader Functionality • OpenGL ES 3.2 Shading Language Compatibility - ARB_ES3_2_compatibility • Parallel Compile & Link of GLSL - ARB_parallel_shader_compile • 64-bit Integer Data Types - ARB_gpu_shader_int64 • Shader Atomic Counter Operations - ARB_shader_atomic_counter_ops • Query Clock Counter - ARB_shader_clock • Shader Ballot and Broadcast - ARB_shader_ballot Details…
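The retry loop on the slide above can be traced on the CPU with a toy model of a mip chain in which each level is either resident or not. Everything here is hypothetical scaffolding rather than OpenGL or GLSL API: `sparse_mip_model`, `model_fetch`, and `fetch_until_resident` are illustrative names, one value stands in for a whole level, and the loop adds an upper bound on the clamp that the naive slide version does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LEVELS 16

/* Toy model: one value and one residency flag per mipmap level. */
typedef struct {
    int   level_count;
    bool  resident[MAX_LEVELS];
    float value[MAX_LEVELS];
} sparse_mip_model;

/* Models a clamped sparse fetch: the level used is the larger of the
 * "hardware" LOD and the clamp. Non-resident fetches return zero, the
 * behavior the slides attribute to ARB_sparse_texture2. */
bool model_fetch(const sparse_mip_model *tex, int hw_lod, int lod_clamp,
                 float *texel)
{
    assert(tex != NULL && texel != NULL);
    assert(tex->level_count >= 1 && tex->level_count <= MAX_LEVELS);
    int lod = (hw_lod > lod_clamp) ? hw_lod : lod_clamp;
    if (lod >= tex->level_count)
        lod = tex->level_count - 1;
    *texel = tex->resident[lod] ? tex->value[lod] : 0.0f;
    return tex->resident[lod];
}

/* Mirrors the slide's loop. Returns the number of fetches issued until a
 * resident level was hit, or 0 if no level at or above hw_lod is resident. */
int fetch_until_resident(const sparse_mip_model *tex, int hw_lod, float *texel)
{
    int fetches = 1;
    bool ok = model_fetch(tex, hw_lod, 0, texel);
    int lod_clamp = 1;
    while (!ok && lod_clamp < tex->level_count) {
        ok = model_fetch(tex, hw_lod, lod_clamp, texel);
        lod_clamp += 1;
        fetches += 1;
    }
    return ok ? fetches : 0;
}
```

With four levels where only levels 2 and 3 are resident, a fetch starting at LOD 0 takes three fetches with two misses, matching the last case pictured on the slide. In a real shader, an unbounded loop like the slide's could spin forever if nothing is resident, which is why the sketch caps the clamp at the level count.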
  34. Page 34 ES 3.2 Compatibility (shader support) • NEW extension ARB_ES3_2_compatibility • Just say #version 320 es in your GLSL shader - Develop and use OpenGL ES 3.2’s GLSL dialect from regular OpenGL - Helps desktop developers target mobile and embedded devices • ES 3.2 GLSL adds functionality already in OpenGL - KHR_blend_equation_advanced, OES_sample_variables, OES_shader_image_atomic, OES_shader_multisample_interpolation, OES_texture_storage_multisample_2d_array, OES_geometry_shader, OES_gpu_shader5, OES_primitive_bounding_box, OES_shader_io_blocks, OES_tessellation_shader, OES_texture_buffer, OES_texture_cube_map_array, KHR_robustness - Notably Shader Model 5.0, geometry & tessellation shaders
  35. Page 35 Parallel Compile & Link of GLSL • NEW extension ARB_parallel_shader_compile - Lets OpenGL implementations distribute GLSL shader compilation and program linking to multiple CPU threads to speed compilation throughput - Allows apps to better manage GLSL compilation overheads - Benefit: Faster load time for new shaders and programs on multi-core CPU systems - Good practice: Construct multiple GLSL shaders/programs, then defer querying their state or using them for as long as possible, or until completion status is true • Part 1: Tells OpenGL’s GLSL compiler how many CPU threads to use for parallel compilation - void glMaxShaderCompilerThreadsARB(GLuint threadCount) - Initially allows implementation-dependent maximum (initial value 0xFFFFFFFF) - Zero means do not use parallel GLSL compilation • Part 2: Shader and program query if compile or link is complete - Call glGetShaderiv or glGetProgramiv on GL_COMPLETION_STATUS_ARB parameter - Returns true when compile is complete, false if still compiling - Unlike other queries, will not block for compilation to complete.
  36. Page 36 64-bit Integer Data Types in GLSL • GLSL has had 32-bit integer and 64-bit floating-point for a while… • Now adds 64-bit integers - NEW extension ARB_gpu_shader_int64 • New data types - Signed: int64_t, i64vec2, i64vec3, i64vec4 - Unsigned: uint64_t, u64vec2, u64vec3, u64vec4 - Supported for uniforms, buffers, transform feedback, and shader input/outputs • Standard library extended to 64-bit integers • Programming interface - Uniform setting - glUniform{1,2,3,4}i64{,v}ARB - glUniform{1,2,3,4}ui64{,v}ARB - Direct state access (DSA) variants as well - glProgramUniform{1,2,3,4}i64{,v}ARB - glProgramUniform{1,2,3,4}ui64{,v}ARB - Queries for 64-bit uniform integer data
  37. Page 37 Shader Atomic Counter Operations in GLSL • NEW ARB_shader_atomic_counter_ops extension - Builds on ARB_shader_atomic_counters extension (2011, OpenGL 4.2) - Original atomic counters quite limited - Could only increment, decrement, and query • New operations supported on counters - Addition and subtraction: atomicCounterAddARB, atomicCounterSubtractARB - Minimum and maximum: atomicCounterMinARB, atomicCounterMaxARB - Bitwise operators (AND, OR, XOR, etc.) - atomicCounterAndARB, atomicCounterOrARB, atomicCounterXorARB - Exchange: atomicCounterExchangeARB - Compare and Exchange: atomicCounterCompSwapARB
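For readers more at home on the CPU, the new counter operations behave much like C11 atomics on a 32-bit unsigned integer. The sketch below is an analogy only, not GLSL and not part of the extension: `counter_t` and the `counter_*` functions are hypothetical names, and it assumes each operation returns the counter's value from before the update, which is how the GLSL functions are understood to behave.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* CPU stand-in for a GLSL atomic_uint counter. */
typedef _Atomic uint32_t counter_t;

/* Analogue of atomicCounterAddARB: add, return the previous value. */
uint32_t counter_add(counter_t *c, uint32_t data)
{
    assert(c != NULL);
    return atomic_fetch_add(c, data);
}

/* Analogue of atomicCounterMaxARB. C11 has no fetch-max, so retry with
 * compare-and-swap until the stored value is at least data. */
uint32_t counter_max(counter_t *c, uint32_t data)
{
    assert(c != NULL);
    uint32_t old = atomic_load(c);
    while (old < data && !atomic_compare_exchange_weak(c, &old, data)) {
        /* old was refreshed with the current value; loop re-checks it */
    }
    return old;
}

/* Analogue of atomicCounterExchangeARB: store data, return previous value. */
uint32_t counter_exchange(counter_t *c, uint32_t data)
{
    assert(c != NULL);
    return atomic_exchange(c, data);
}

/* Analogue of atomicCounterCompSwapARB: store data only if the counter
 * equals compare; the previous value is returned either way. */
uint32_t counter_comp_swap(counter_t *c, uint32_t compare, uint32_t data)
{
    assert(c != NULL);
    uint32_t expected = compare;
    atomic_compare_exchange_strong(c, &expected, data);
    return expected;  /* unchanged on success, current value on failure */
}
```

Starting from a counter holding 10, `counter_add(&c, 5)` returns 10 and leaves 15; `counter_max(&c, 20)` returns 15 and leaves 20; `counter_comp_swap(&c, 20, 7)` returns 20 and leaves 7, while a compare against a non-matching value leaves the counter untouched.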
  38. Page 38 Query Clock Counter in GLSL • NEW extension ARB_shader_clock • New functions query a free-running “clock” - 64-bit monotonically incrementing shader counter - uint64_t clockARB(void) - uvec2 clock2x32ARB(void) - Avoids requiring 64-bit integers, instead returns two 32-bit unsigned integers • Similar to Win32’s QueryPerformanceCounter - But within the GPU shader complex • Can allow shaders to monitor their performance - Details implementation-dependent
  39. Page 39 Shader Ballot and Broadcast • NEW extension ARB_shader_ballot - Assumes 64-bit integers • Concept - A group of invocations (shader threads) that execute in lockstep can do a limited form of cross-invocation communication via a group broadcast of an invocation’s value, or a broadcast of a bit array representing a predicate value from each invocation in the group - Allows efficient collective decisions within a group of invocations • New built-in variables - Uniform: gl_SubGroupSizeARB - Integer input: gl_SubGroupInvocationARB - Mask input: gl_SubGroupEqMaskARB, gl_SubGroupGeMaskARB, gl_SubGroupGtMaskARB, gl_SubGroupLeMaskARB, gl_SubGroupLtMaskARB • New GLSL functions - uint64_t ballotARB(bool value)
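A ballot is easiest to picture as a 64-bit mask with one bit per invocation in the group. The following CPU-side sketch models that idea; it is not GLSL, the `model_*` names are hypothetical, and the compaction use at the end is a commonly used pattern added here for illustration rather than something taken from the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models ballotARB across a whole group at once: bit i of the result is
 * set when invocation i passed true. */
uint64_t model_ballot(const bool *predicates, unsigned subgroup_size)
{
    assert(subgroup_size <= 64);
    uint64_t mask = 0;
    for (unsigned i = 0; i < subgroup_size; ++i) {
        if (predicates[i])
            mask |= (uint64_t)1 << i;
    }
    return mask;
}

/* Models gl_SubGroupLtMaskARB: bits of all invocations with a lower index. */
uint64_t model_lt_mask(unsigned invocation)
{
    assert(invocation < 64);
    return ((uint64_t)1 << invocation) - 1;
}

/* Illustrative use: how many lower-numbered invocations voted true. Each
 * voting invocation can use this as its slot in a compacted output list. */
unsigned model_compaction_slot(uint64_t ballot, unsigned invocation)
{
    uint64_t below = ballot & model_lt_mask(invocation);
    unsigned count = 0;
    while (below) {
        below &= below - 1;  /* clear the lowest set bit */
        ++count;
    }
    return count;
}
```

For a group of four invocations voting `{true, false, true, true}`, the ballot is binary `1101` (13). Invocation 3 sees two lower-numbered voters, so it would write to slot 2, without any atomics or shared memory.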
  40. Page 40 GLEW Support Available NOW •GLEW = The OpenGL Extension Wrangler Library - Open source library - http://glew.sourceforge.net/ - Your one-stop-shop for API support for all OpenGL extension APIs •GLEW 1.13.0 provides API support for all 13 extensions NOW •Thanks to Nigel Stewart and Jon Leech for this
  41. Page 41 In Review •OpenGL in 2015 has 13 new standard extensions • Graphics pipeline operation •ARB_fragment_shader_interlock •ARB_sample_locations •ARB_post_depth_coverage •ARB_ES3_2_compatibility •Tessellation bounding box •Multisample line width query •ARB_shader_viewport_layer_array • Texture mapping functionality •ARB_texture_filter_minmax •ARB_sparse_texture2 •ARB_sparse_texture_clamp • Shader functionality •ARB_ES3_2_compatibility •ES 3.2 shading language support •ARB_parallel_shader_compile •ARB_gpu_shader_int64 •ARB_shader_atomic_counter_ops •ARB_shader_clock •ARB_shader_ballot
  42. Page 42 GPU Hardware Support
        Extension                       | Fermi | Kepler | Maxwell 1, K1* | Maxwell 2, X1*
        ARB_ES3_2_compatibility         |   ✓   |   ✓    |       ✓        |       ✓
        ARB_parallel_shader_compile     |   ✓   |   ✓    |       ✓        |       ✓
        ARB_gpu_shader_int64            |   ✓   |   ✓    |       ✓        |       ✓
        ARB_shader_atomic_counter_ops   |   ✓   |   ✓    |       ✓        |       ✓
        ARB_shader_clock                |       |   ✓    |       ✓        |       ✓
        ARB_shader_ballot               |       |   ✓    |       ✓        |       ✓
        ARB_fragment_shader_interlock   |       |        |                |       ✓
        ARB_sample_locations            |       |        |                |       ✓
        ARB_post_depth_coverage         |       |        |                |       ✓
        ARB_shader_viewport_layer_array |       |        |                |       ✓
        ARB_texture_filter_minmax       |       |        |                |       ✓
        ARB_sparse_texture2             |       |        |                |       ✓ †
        ARB_sparse_texture_clamp        |       |        |                |       ✓ †
        * = Tegra driver support later
        † = assumes OS support for sparse resources
  43. Page 43 Thanks •Multi-vendor effort! •Particular thanks to specification leads - Pat Brown (NVIDIA) - Piers Daniell (NVIDIA) - Slawomir Grajewski (Intel) - Daniel Koch (NVIDIA) - Jon Leech (Khronos) - Timothy Lottes (AMD) - Daniel Rakos (AMD) - Graham Sellers (AMD) - Eric Werness (NVIDIA)
  44. Page 44 How to get OpenGL 2015 drivers now • NVIDIA developer web site - https://developer.nvidia.com/opengl-driver • For Quadro and GeForce - Windows, version 355.58 - Linux, version 355.00.05 - Newer versions may be available • Supported NVIDIA GPU generations - Maxwell - Many extensions in the set, such as ARB_fragment_shader_interlock, need the new Maxwell 2 GPU generation - Example: GeForce 9xx, Titan X, Quadro M6000 - Kepler - Fermi
  45. Page 45 NVIDIA’s driver also includes OpenGL ES 3.2 • Desktop OpenGL driver can create a compliant ES 3.2 context - Develop on a PC, then move your working ES 3.2 code to a mobile device - OpenGL ES 3.2 is basically Android Extension Pack (AEP), standardized by Khronos now • The extensions below are part of OpenGL ES 3.2 core specification now, but they can still be used in contexts below OpenGL ES 3.2 as extensions on supported hardware: - OES_gpu_shader5 - OES_primitive_bounding_box - OES_shader_io_blocks - OES_tessellation_shader - OES_texture_border_clamp - OES_texture_buffer - OES_texture_cube_map_array - OES_draw_elements_base_vertex - KHR_robustness - EXT_color_buffer_float - KHR_debug - KHR_texture_compression_astc_ldr - KHR_blend_equation_advanced - OES_sample_shading - OES_sample_variables - OES_shader_image_atomic - OES_shader_multisample_interpolation - OES_texture_stencil8 - OES_texture_storage_multisample_2d_array - OES_copy_image - OES_draw_buffers_indexed - OES_geometry_shader
  46. Page 46 Conclusions •NEW standard OpenGL Extensions announced at SIGGRAPH for 2015 •NVIDIA already shipping support for all these extensions - Released same day Khronos announced the functionality •Get latest Maxwell 2 generation GPU to access extensions depending on latest hardware